Architectural Operations in Cloud
Computing
Ragnar Skúlason
Advisors
Klaus Marius Hansen
Helmut Wolfram Neukirchen
Faculty Representative
Hjálmtýr Hafsteinsson
Bibliographic information:
Ragnar Skúlason, 2011, Architectural Operations in Cloud Computing, MSc thesis, Faculty of
Industrial Engineering, Mechanical Engineering and Computer Science, University of
Iceland.
Abstract
Rapid scalability is important in cloud computing in order to serve growing communi-
ties and optimize hardware costs. This scalability can be hard to achieve, especially in
software with a static architecture. Changing the software architecture of running systems on
multiple devices over the Internet is a hard and delicate process, as updating live software
can cause faults and failures while software systems are being restarted. Applying the
study of software architecture to the dynamics of cloud computing can be beneficial
in this case and can broaden what is possible in cloud computing.
The Architectural Scripting Language (ASL) is a language for expressing the dynamic
aspect of run-time and deployment-time software architecture. In the following thesis,
ASL is taken to cloud computing which enables dynamic software architecture changes
to meet the dynamics of a computing infrastructure. We present Cloud ASL, which is
an external domain-specific language which enables architectural operations and archi-
tectural scripting in cloud computing environments. Cloud ASL is modeled and tested
by the creation of a distributed cloud computing ray tracing system which was built to
utilize Cloud ASL for its distributed and cloud computing mechanism.
Cloud ASL is a framework which enables modelling dynamic aspects of runtime software
architecture with architectural operations in cloud computing, and it is suitable for
creating scalable and modifiable cloud computing software.
Preface
I have been interested in software development and in particular web development and
distributed development for a decade. I spent the academic year 2009-2010 as an ex-
change student at the University of California, Berkeley, and by coincidence I ended
up in a cloud computing course. After attending this course I became very interested in
the subject, so much so that when I arrived back in Iceland to do my MSc thesis I started
by looking for possible cloud computing based projects. After discussing project ideas
with professor Klaus Marius Hansen we settled on this very interesting research topic,
which joins cloud computing and software architecture.
Contents
List of Figures
List of Tables
Acronyms and Abbreviation
Acknowledgements
1. Introduction
1.1. Motivation
1.2. Problem Statement
1.3. Thesis Outline
2. Background
2.1. Cloud Computing
2.1.1. Essential Characteristics
2.1.2. Service Models
2.1.3. Deployment Models
2.2. Software Architecture
2.2.1. Architectural Qualities
2.2.2. Architectural Description
2.2.3. Architectural Prototype
2.3. Architectural operations
2.3.1. Architectural Change
2.3.2. Architectural Scripting and Architectural Operations
4. Evaluation
4.1. Qualitative Evaluation
4.1.1. Utility and Completeness
4.1.2. Quality Attribute Scenarios
4.2. Quantitative Evaluation
4.2.1. Performance
4.2.2. Scalability
Bibliography
List of Figures
2.6. Model View example, package overview of the web messenger software system
4.1. A scatter chart of startup timing of Cloud ASL script vs. manual script
4.2. A scatter chart of operations timing of Cloud ASL script vs. manual script
4.3. Bar chart of ASL performance, rendering efficiency with different amount of workers
4.4. Error chart of ASL performance, rendering efficiency with different amount of workers
List of Tables
2.1. Economy of scale in 2006 for medium-size data center vs. very large data center [5, 6]
4.10. updated UI
4.11. New UI
Acronyms and Abbreviation
AD Architectural Description
UI User Interface
VM Virtual Machine
Acknowledgements
I would first of all like to thank my advisor, professor Klaus Marius Hansen, for the
great guidance he provided me with and what often seemed like unlimited knowledge in
our fields of study. I would like to thank my second advisor, associate professor Helmut
Wolfram Neukirchen, for joining us in the later period of the work and managing the
last parts of the thesis defense.
I also want to thank the University of Iceland Research Fund for allowing me to work
on this project full time, Amazon Web Services for giving me funds to
use their public cloud, Reiknisstofnun Háskóla Íslands for their effort to give us access to
cloud computing services through their hardware, and GreenQloud for giving us access
to their available infrastructure.
Lastly I want to thank all the faculty of the computer science department of the University
of Iceland for their dedication to the science.
Thanks
Ragnar Skulason
1. Introduction
In the following chapter the subject of the thesis will be introduced. First, I motivate the
work by describing some of the problems treated in the thesis. Secondly, the challenges
will be summarized in a problem statement and finally the attempts at solving them will
briefly be outlined.
It is assumed that the reader of this thesis has a background in computer science and possesses
general knowledge of distributed systems and software architecture, as well as some
knowledge of architectural operations, cloud computing, the Java OSGi framework and
computer graphics.
1.1. Motivation
The work of this thesis is done as an attempt to solve this problem and ease deployment
and architectural change. Enabling software developers to manage software architec-
ture dynamically on top of a cloud computing infrastructure can simplify deployment
processes significantly. This can give software developers a useful tool to update software
and scale software systems which will benefit cloud computing development. Building
complex scripts or programs to deploy and update software on clouds would be replaced
by simple architecture scripts that can be reviewed and tested.
Changing software architecture can be important for scalable software, as for any highly
available software. Any change in the software architecture can be a difficult task and
would require shutting down old software components and starting new software components,
resulting in software downtime. Architectural scripting could be a solution to
this problem. It could allow architectural modification of running software, resulting
in minimal downtime. In general, support for architectural change could be made available in
a cloud computing environment, enabling dynamic scaling of cloud computing software
architectures.
In this thesis, we argue that modeling dynamic aspects of runtime software architecture
with architectural scripting complements cloud computing tools and techniques, especially
Infrastructure as a Service (IaaS). Architectural scripting can be the basis of managing
and modeling the dynamic aspects of architectural change.
The Architectural Scripting Language (ASL) has been the focus of two studies, Hydra [8]
and Alloy [9], but has not been implemented in other cases. Therefore the question
arises: Can architectural scripting be tailored for cloud computing infrastructure? And
if so, how would we implement architectural scripting on cloud computing infrastructure?
The theoretical and historical background of the concepts this thesis is based on will be
described in chapter 2. Chapter 3 demonstrates the implementation of our approach
and the process of building it. In chapter 4, the work will be evaluated. Chapter 5
discusses our work, and the last chapter summarizes the work we did for this
thesis.
2. Background
The work in this thesis can be categorized into software architecture and distributed
computing. In this chapter, we will present related work in these areas, mainly cloud
computing, software architecture and its subset architectural operations. Through the
rest of the thesis, the contents of this chapter will be used as input to design and to govern
discussions.
‘Cloud computing’ is a relatively new concept in the computing world, although the idea
has existed for a longer time. A few years after the dot-com bubble, companies like Ama-
zon started leasing out their underutilized and unused hardware with cloud computing
technology, resulting in cloud computing gaining attention and popularity within the
computing industry. Cloud computing has become a viable option in recent years for
several reasons. The “web 2.0” shift can be named as an example, as providers are shifting
their services from locally stored and hosted services to external services. A few years
ago a web company would have needed to host and maintain its own billing system and
payment gateway, making long-term and expensive contracts with credit card compa-
nies, banks, security companies, etc. With the emergence of companies like PayPal and
Chargify, any individual can now accept credit cards without a contract or long-term
commitment and use the services on a pay-as-you-go basis. As the Internet has become
a part of almost any household and is viewed as a commodity everyone has access to,
companies can move their software services more securely to the Internet, and thereby
make cloud computing platforms a possible choice. The economy of scale is greatly in
favor of cloud computing. Cloud users can lease computing power from anywhere and
cloud providers can cut their prices by investing in huge data centers and potentially
save on each server, which makes this service interesting to users, such as small and
medium-sized companies. Table 2.1 shows the cost difference in medium vs. very large
data centers. Companies also have valid reasons to offer cloud computing services. The
big computing companies have financial incentives similar to those of users, as they can buy and
operate computing instances at a fraction of the price small or medium-sized companies
pay and resell them at a higher price. Vendors like Microsoft also need to defend their
franchise by offering cloud computing services on their platform to leverage their users
                 Cost in medium-sized data center   Cost in very large data center   Ratio
Network          $95/Mbps                           $13/Mbps                         7.1
Storage          $26.00/GB/year                     $4.6/GB/year                     5.7
Administration   140 servers/admin                  > 1000 servers/admin             7.1
Table 2.1.: Economy of scale in 2006 for medium-size data center vs. very large data
center [5, 6]
The concept “utility computing”, which cloud computing is in many ways based on, has
been the vision of computer scientists for decades [11]. In the 1960s, the American
computer scientist John McCarthy stated that
“If computers of the kind I have advocated become the computers of the
future, then computing may someday be organized as a public utility just as
the telephone system is a public utility... The computer utility could become
the basis of a new and important industry” [12]
Utility computing is based on the concept of computing resources as a utility, just as water,
gas and electricity are. One pays for one’s computing resources (CPU, data storage,
data transfer etc.) while using them. Like tap water, the user turns on the computing resources
and, when finished using them, turns them off and only pays for the amount used.
This type of computing service was not available until a few years ago. One of the
factors that influences companies to become cloud computing infrastructure providers is
the availability of underutilized computing resources [10]; Amazon’s web
services (AWS), for example, started out serving its internal operations [13]. The on-line
book retailer Amazon is a good example of a cloud service provider. Amazon needs to be able to provide
hardware for its peak time usage, which is only fully used a few times per year. On a
regular basis the hardware sits at a very low utilization, on average at 10% [14]
of its capacity, to leave room for occasional spikes. To make use of its underutilized
hardware, Amazon started leasing its computing power to users on an hourly basis.
As a result, users are now able to buy a virtualized computing instance, with their software and
operating system of choice, and be billed by the hour [14, 15].
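The economics of the utility model described above can be illustrated with a small calculation. The following is a minimal sketch and not part of the thesis work: the function names, the prices and the load profile are hypothetical values chosen only to show why paying per hour favors workloads whose average utilization is low, such as the roughly 10% mentioned for Amazon.

```python
HOURS_PER_YEAR = 24 * 365


def owned_cost(peak_servers, cost_per_server_hour, hours=HOURS_PER_YEAR):
    """Cost of owning enough servers for peak load during the whole period."""
    return peak_servers * cost_per_server_hour * hours


def leased_cost(hourly_demand, price_per_instance_hour):
    """Pay-as-you-go cost: only the instance-hours actually used are billed."""
    return sum(hourly_demand) * price_per_instance_hour


def average_utilization(hourly_demand, peak_servers):
    """Fraction of the owned peak capacity that is used on average."""
    return sum(hourly_demand) / (peak_servers * len(hourly_demand))


def break_even_utilization(cost_per_server_hour, price_per_instance_hour):
    """Utilization above which owning the hardware becomes cheaper than leasing."""
    return cost_per_server_hour / price_per_instance_hour


if __name__ == "__main__":
    # Hypothetical numbers: hardware sized for 100 servers, but only 10 are
    # busy in a typical hour; leasing is assumed to cost twice as much per hour.
    demand = [10] * HOURS_PER_YEAR
    print("utilization:", average_utilization(demand, 100))
    print("owned cost: ", owned_cost(100, 0.05))
    print("leased cost:", leased_cost(demand, 0.10))
    print("break-even: ", break_even_utilization(0.05, 0.10))
```

Under these assumed prices, leasing stays cheaper as long as the average utilization of owned hardware would be below the break-even value, even though each leased hour is more expensive than an owned one.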
Cloud computing has been a buzzword in the computing industry recently and has been
gaining a lot of attention, see figure 2.1.
Figure 2.1.: Cloud computing (blue) vs. grid computing (red) trends¹
The term “cloud computing” is recent, and it therefore has no single definition
that has been accepted by all cloud computing users. However, there are a few key principles
that are generally accepted as central to cloud computing, and the differences between
definitions are usually not great. In this thesis we are going to use a definition made by
the UC Berkeley RAD Lab [10] which states:
and the definition from the National Institute of Standards and Technology (NIST) [16]:
¹ Google Trends, http://trends.google.com, accessed: 29. July 2010
tion. This cloud model promotes availability and is composed of five essen-
tial characteristics, three service models, and four deployment models.”
These two definitions complement each other and can be used jointly or separately.
Whereas the UC Berkeley definition defines the cloud computing essentials, the NIST
definition defines key elements of cloud computing, or five characteristics, three service
models and four deployment models. These elements will be reviewed below.
In the early days of cloud computing its definition was disputed, and many different computing
services defined themselves as cloud computing services. On September 25, 2008,
Larry Ellison, CEO of Oracle, argued [17]:
“The interesting thing about cloud computing is that we’ve redefined cloud
computing to include everything that we already do. I can’t think of anything
that isn’t cloud computing with all of these announcements. The computer
industry is the only industry that is more fashion-driven than women’s fash-
ion. Maybe I’m an idiot, but I have no idea what anyone is talking about.
What is it? It’s complete gibberish. It’s insane. When is this idiocy going to
stop?”
“We’ll make cloud computing announcements. I’m not going to fight this
thing. But I don’t understand what we would do differently in the light of
cloud computing other than change the wording of some of our ads. That’s
my view.”
The NIST definition of Essential Characteristics of Cloud Computing lists those charac-
teristics that are required of a service to make it qualify as true “Cloud Computing”, in
other words, if it doesn’t do this, it isn’t Cloud Computing:
On-demand self-service
Just like electricity, a consumer can provision computing resources such as computing
instances, networking or storage on demand, without human interaction.
Resource pooling
Resources are shared within the cloud. This means that numerous clients may be
using the same set of resources at the same time, and that clients have no control
Rapid elasticity
Resources can be rapidly and elastically provisioned, giving clients opportunities
to quickly scale up or scale down. To the consumer, the resources seem unlimited
and can be purchased in any quantity at a time.
Measured Service
The cloud provider acts like any utility provider who measures and bills the amount
of service provided.
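The measured-service characteristic can be sketched in a few lines. The class below is a hypothetical illustration, assuming a provider that keeps a per-client record of consumed units and a fixed price per unit; the resource names and rates are invented for the example and do not describe any real provider's billing system.

```python
from dataclasses import dataclass, field


@dataclass
class UsageMeter:
    """Accumulates metered usage per resource type for a single client."""

    rates: dict                                # price per unit of each metered resource
    usage: dict = field(default_factory=dict)  # units consumed so far

    def record(self, resource, amount):
        """Add consumed units; only resources with a known rate can be metered."""
        if resource not in self.rates:
            raise KeyError(f"unmetered resource: {resource}")
        self.usage[resource] = self.usage.get(resource, 0) + amount

    def bill(self):
        """Total charge: consumed units times the rate, summed over resources."""
        return sum(self.rates[name] * units for name, units in self.usage.items())


if __name__ == "__main__":
    # Hypothetical rates in an arbitrary currency unit.
    meter = UsageMeter(rates={"instance_hours": 2, "storage_gb": 1})
    meter.record("instance_hours", 3)
    meter.record("instance_hours", 2)
    meter.record("storage_gb", 10)
    print("amount due:", meter.bill())
```

A client that records no usage is billed nothing, which is the property that distinguishes a metered utility from a fixed long-term contract.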
The real highlight of cloud computing is its versatility and adaptability. One can catego-
rize the service provided by cloud computing in three classes, Infrastructure as a Service,
Platform as a Service and Software as a Service where each service category can be lever-
aged independently or consumed in combination with other service tiers:
[Figure 2.2: cloud computing service models with example services. Software as a Service: Google Apps, FreshBooks, SalesForce. Infrastructure as a Service: Amazon EC2, Rackspace Cloud, Eucalyptus.]
Infrastructure as a Service
Infrastructure as a Service (IaaS) is one of the service models that cloud computing
is based on. IaaS delivers computer infrastructure to clients as a service, usually in
the form of a virtualized platform. Users can lease server- or networking hardware
as a fully outsourced service instead of purchasing it. Examples of IaaS services
are Amazon EC2², Rackspace Cloud³ and the Eucalyptus Private cloud software
infrastructure [18]. The Amazon IaaS will be described in more detail as it is the
service we ended up using and it is one of the leading cloud computing service
providers on the market; it is therefore the model many cloud providers follow.
Amazon offers multiple cloud computing services. Examples are Relational Database
² Amazon Elastic Compute Cloud. http://aws.amazon.com/ec2
³ http://www.rackspacecloud.com
Services, Simple Queuing Services, Mechanical Turk, CloudFront and EC2⁴. All
these services, except EC2, can be thought of as SaaS or PaaS (see below), whereas Amazon
EC2 is their main IaaS service. With Amazon EC2, a user can lease a virtualized
computing instance for any time period, from minutes to years. These computing
instances are virtual machines powered by the XEN hypervisor⁵ and bundled
with a customized operating system and software. The operating system can be
anything supported by Amazon, but the operating system’s kernel needs to
interact with the XEN hypervisor. Therefore only a limited number of kernels is
available, but there are many pre-installed software bundles available, both open
source and commercial. Users can select among multiple sizes of computing instances,
from 1 core to 33.5 EC2 CPU cores, 613 MB to 68.4 GB virtual memory, up to 1 TB
per-volume hard drive storage and both 32-bit and 64-bit platforms in preselected
instance types⁶. When users request an instance, they first need to select the size of
instance needed, then the machine image (operating system and bundled software),
next a public/private key-pair to access the virtual machine and lastly the security
group, which defines the allowed firewall rules. They then get a public IP address and
DNS name through which they can access the instance. This gives them administrator
rights to that virtual computer, on which they can install any software
and which they can use in any way they prefer. When users have finished using the computing
instance, they can terminate it and only pay for the amount of time used. All data
stored on the instance will be destroyed unless copied to more permanent storage.
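The request steps described above (instance size, machine image, key pair, security group, followed by termination and billing) can be modeled with a small in-memory sketch. This is not the Amazon EC2 API: `ToyCloud`, `InstanceRequest`, the DNS naming scheme and the prices are hypothetical, and charging every started hour as a full hour is an assumption made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceRequest:
    """The four choices a user makes when requesting an instance."""

    instance_type: str   # size of the instance, e.g. "small"
    machine_image: str   # operating system and bundled software
    key_pair: str        # public/private key pair used to log in
    security_group: str  # named set of allowed firewall rules


@dataclass
class Instance:
    request: InstanceRequest
    public_dns: str
    launched_at: float                     # hours since an arbitrary reference time
    terminated_at: Optional[float] = None


class ToyCloud:
    """Hypothetical in-memory stand-in for an IaaS provider."""

    def __init__(self, hourly_prices):
        self.hourly_prices = dict(hourly_prices)
        self._counter = 0

    def launch(self, request, now):
        """Validate the requested size and hand back a reachable instance."""
        if request.instance_type not in self.hourly_prices:
            raise ValueError(f"unknown instance type: {request.instance_type}")
        self._counter += 1
        dns = f"instance-{self._counter}.cloud.example"  # made-up naming scheme
        return Instance(request, dns, launched_at=now)

    def terminate(self, instance, now):
        """Stop the instance and return the charge for the time it was running."""
        if instance.terminated_at is not None:
            raise RuntimeError("instance already terminated")
        if now < instance.launched_at:
            raise ValueError("termination time precedes launch time")
        instance.terminated_at = now
        # Assumption: every started hour is charged as a full hour.
        started_hours = math.ceil(now - instance.launched_at)
        return started_hours * self.hourly_prices[instance.request.instance_type]


if __name__ == "__main__":
    cloud = ToyCloud({"small": 1, "large": 4})
    request = InstanceRequest("small", "debian-web", "my-key", "web-only")
    instance = cloud.launch(request, now=0.0)
    print(instance.public_dns, cloud.terminate(instance, now=2.5))
```

The sketch deliberately keeps no persistent state for a terminated instance, mirroring the point that data on the instance is lost unless it is copied to more permanent storage.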
Amazon’s EC2 includes three standard instance types, which include different amounts
of CPU and memory and are priced differently, see table 2.3. On these instances the
client can choose from multiple virtual images to be pre-installed. These images
include an operating system (e.g. Debian-based Linux, RedHat-based Linux
or Windows) with different types of software pre-installed (e.g. database, batch
processing, or web hosting software). The user can even create a customized virtual
image which can be installed on these instances.
An example of an IaaS service user is GoGoYoKo⁸, a new Icelandic on-line music
store which sells and streams music in digital audio format and allows users to
⁴ See http://aws.amazon.com/ for details about each service
⁵ The XEN hypervisor is a hardware virtualization layer created by the University of Cambridge Computer
Laboratory and licensed under the GNU General Public License. http://www.xen.org
⁶ See http://aws.amazon.com/ec2/instance-types/, accessed 12.04.2011
⁷ Amazon EC2 quotas, http://aws.amazon.com/ec2/pricing/, accessed: 10. June 2010
⁸ gogoyoko music store - Fair Play in Music. http://www.gogoyoko.com/
listen to music on-line as well as being a social network site. This site is entirely
hosted on Amazon’s EC2. What this kind of company gains by using an IaaS service
is a low cost of entry in terms of hardware, rapid scaling, a reduced
number of initial employees and simplified operations. When startup companies
use cloud computing initially, they do not need to invest initial capital in estimated
future hardware requirements. Instead they set up their environment and services
on a few small instances, and when they are ready to go public they simply
increase the size of the running instances or the number of instances. This way the company
can use the capital to improve their services and have a better service to provide
when launched. If the company does not need scalability in their computing environment
and the computing requirements are quite stable, then cloud computing
would probably not be a financially viable option.
Most startup companies do not gain great popularity on day one; instead it takes
time to gain publicity. At times they can gain a lot of traction
in a short time, for example if their website is featured in the news or gets good
publicity on social networking sites. When this happens, a company will receive
a huge usage spike for a short period of time. If it is using a cloud computing
environment, it can increase its computing power almost instantly, and when the
computing load drops, it simply releases some of the computing power. In this
way a company does not need to own all the hardware required to handle this kind
of spike, and does not lose possible customers because of a lack of service when its
own hardware could not handle the traffic.
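The effect of such a spike on resource usage can be sketched as follows. This is a simplified illustration under stated assumptions: each instance is assumed to serve a fixed number of requests per second, scaling is assumed to be instantaneous, and the load trace and capacity figure are hypothetical.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance, minimum=1):
    """Smallest number of instances that can serve the given load."""
    return max(minimum, math.ceil(requests_per_second / capacity_per_instance))


def elastic_plan(load_trace, capacity_per_instance, minimum=1):
    """Instance count per time step when capacity follows the load."""
    return [instances_needed(load, capacity_per_instance, minimum)
            for load in load_trace]


def fixed_plan_hours(load_trace, capacity_per_instance):
    """Instance-hours used when capacity is sized for the peak at every step."""
    peak = instances_needed(max(load_trace), capacity_per_instance)
    return peak * len(load_trace)


if __name__ == "__main__":
    # Hypothetical hourly load in requests per second, with a short spike
    # such as one caused by news coverage.
    trace = [50, 80, 1200, 900, 60]
    plan = elastic_plan(trace, capacity_per_instance=100)
    print("instances per hour:  ", plan)
    print("elastic instance-hours:", sum(plan))
    print("fixed instance-hours:  ", fixed_plan_hours(trace, 100))
```

The difference between the two totals is the capacity a company would pay for without using it if it had to own hardware for the peak.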
For most new companies it can be very hard and bothersome to recruit capable employees.
However, by not hosting their own hardware and by handling all computing-related
interactions through the web, they can reduce the number of administrators
needed, which can ease some of the start-up human resources problems.
Companies can become constrained by their data centers. Let us take as an example
a telecommunication company (telco) that starts up in a certain location.
This company uses regular hardware and sets up its data center at its starting location.
At a future point in time the telco may have grown so much
that its original location cannot hold its operation anymore. It will need to move
to a larger location and will face a hardware problem. Moving a live and operating
data center can be extremely expensive and might not even be possible in
some cases. This can scatter the company and its administration across multiple locations
and make regular daily operation harder than necessary. This problem can
be solved by using cloud computing.
Platform as a Service
Platform as a Service (PaaS) is another service model of cloud computing. PaaS is a
layer above IaaS in figure 2.2. PaaS delivers a computing platform and/or solution
stack as a service; it often runs on top of the IaaS layer and can in turn host the
SaaS layer (see below). Compared to IaaS, a virtual machine as a service, PaaS can
be viewed as a programming language environment as a service, e.g. a Java Virtual
Machine as a service.
Examples of PaaS service models are Google App Engine⁹, which supports the Java
and Python programming languages; Microsoft Azure¹⁰, which supports the .NET
programming framework; and Heroku¹¹, which supports the Ruby programming
language and the Rails framework.
The Google App Engine (GAE) service supports Java and Python and it virtualizes
applications across multiple servers and data centers. GAE only supports Google-specific
data storage and database engines. As previously stated, programs can
be written in Java, or other JVM languages such as Groovy, JRuby, Scala, Clojure,
Jython, a special version of Quercus, and in Python with Python web frameworks
that run on the Google App Engine such as Django, CherryPy, Pylons, web2py and
Google’s own web app framework.
The programs written for the Google App Engine must use Google-supported
APIs and many common APIs are not supported, e.g. the Java Thread API. In
contrast to Amazon Web Services, where you can set up your own database on a
virtual instance or use an Amazon-provided non-relational database, Google only
supports its own non-relational database, based on Google BigTable.
The Google App Engine is free of charge for minimal usage, but for more usage, a
user pays for the CPU time consumed by their software, the data it transfers and the data
it stores. The CPU time is calculated in the number of hours of a 1.4 GHz processor
running at full capacity. The prices for billable resources are as shown in table 2.4.
The main differences between PaaS and IaaS are ease of scalability, flexibility,
data lock-in and simplicity. For PaaS there is not much need for the user or the
⁹ Google App Engine. http://code.google.com/appengine/
¹⁰ Windows Azure Platform. http://www.microsoft.com/windowsazure/
¹¹ Heroku, Ruby Cloud Platform as a Service. http://heroku.com/
¹² Google App Engine quotas, http://code.google.com/appengine/docs/quotas.html, accessed: 10. June
2010
PaaS critics have been warning users about data and functionality lock-in [10].
If a user creates a program for the Google App Engine, for example, the user is
restricted to using the Google App Engine, as no other provider supports that
functionality directly. The data stored in the supported database would also need
to be modeled and set up specifically for this kind of database. Moving the data away from
GAE would therefore require a transformation, which in turn could become hard and
expensive. Although there have been a few projects [19, 20] aiming at creating an
open source implementation of GAE, they do not provide all the features of GAE
and there is a high risk of these platforms not being stable enough for commercial
computing.
The complexity of IaaS in comparison to PaaS is also a factor. Users of IaaS services
need to know how to work with and configure the underlying operating systems
and middleware, and be sure that their software is scalable enough. With PaaS this
is not a factor and therefore PaaS can be simpler to use in the long run.
Software as a Service
Software as a Service (SaaS) has been regarded as the main aspect of cloud computing,
but it is more a product of cloud computing than a definition of cloud
computing. SaaS delivers applications as a service over the Internet, usually in the
form of web pages, removing the need to install and run the program on
the user’s computer. SaaS is usually hosted on PaaS or IaaS. Examples of SaaS
services are Google Apps¹³, Chargify¹⁴ and SalesForce¹⁵.
On Google Apps, a SaaS user can register for the service for free or pay for a premium
service, which includes support, more storage and a higher uptime guarantee.
The user can then access an online email client, an online office suite that
includes spreadsheet, presentation and word processing software, and more.
¹³ Google Apps, not to be confused with Google App Engine. http://www.google.com/apps/
¹⁴ Recurring billing SaaS solution. http://chargify.com/
¹⁵ Customer Relationship Management (CRM) SaaS service. http://www.salesforce.com/
There are four primary cloud deployment models. Each can exhibit the previously listed
characteristics; their differences lie primarily in the scope and access of published cloud
services, as they are made available to service consumers.
Private clouds
Private cloud infrastructures are operated for and used by a single
organization. This enables an organization to use existing hardware as a cloud
computing resource. The cloud can be managed by the organization itself or by any third
party. This can be very useful if the organization owns its own hardware which it
wants to enable for cloud usage. An example of a private cloud is a university’s
cloud. The University of Iceland owns a cluster which is set up as a grid, but a grid
is limited to the software currently running on it and can be more complicated for
the end user to operate than cloud instances. Work has been done on setting up
the Eucalyptus¹⁶ IaaS software on this grid, which would enable users (students or
faculty) to get computing instances of their own where they can set up their own
software without the need for administrators.
Community clouds
Community cloud infrastructures are cloud infrastructures which are shared by
several organizations that serve a certain community with shared concerns. Similar
to private clouds, they can be managed by these organizations or by any third
party. An example of a community cloud is NEON¹⁷, or the North European cloud
computing project, which was a cloud computing project of NDGF¹⁸. Several Northern
European universities and research facilities share their computing facilities on a
very large grid. The idea of the NEON project is to evaluate the possibility of using
these grids for a large community cloud. If so, private cloud infrastructures will be
set up on each university’s or research center’s grid and these clouds will then be
shared as one common community cloud.
Public clouds
Public cloud infrastructures are available to the general public or large industry
groups. Public clouds are usually owned by a single organization that sells the service.
Examples of public clouds are the Amazon EC2 and Rackspace clouds.
Hybrid clouds
When two or more types of clouds are connected or used together, they are termed a hybrid cloud.
For example, if a private cloud facility has a limited amount of resources, then when the
¹⁶ The Eucalyptus private IaaS system. http://www.eucalyptus.com/
¹⁷ NEON, Northern Europe Cloud computing. http://www.necloud.org/
¹⁸ Nordic DataGrid Facility, http://www.ndgf.org/
demand for resources is greater than those available, it can scale out to a public
cloud; instead of keeping resources for peak usage, it switches to a public cloud when
needed.
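The hybrid scenario above, where demand overflows from a private to a public cloud, can be sketched as a simple placement rule. The function names and the numbers are hypothetical; the sketch assumes that demand and capacity are measured in the same unit (for example, computing instances) and that the public cloud can always absorb the overflow.

```python
def place_demand(demand, private_capacity):
    """Split demand into (private, public) shares, filling the private cloud first."""
    if demand < 0 or private_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    private = min(demand, private_capacity)
    public = demand - private  # overflow that bursts to the public cloud
    return private, public


def burst_trace(demand_trace, private_capacity):
    """Apply the placement rule to every step of a demand trace."""
    return [place_demand(demand, private_capacity) for demand in demand_trace]


if __name__ == "__main__":
    # Hypothetical demand in instances per hour against a private cloud of 40.
    for private, public in burst_trace([30, 40, 55, 20], private_capacity=40):
        print(f"private={private} public={public}")
```

With this rule the private cloud only needs to be sized for typical demand, since every unit above its capacity is placed on the public cloud for as long as the peak lasts.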
In this thesis we use software architecture definitions and terminology from Bass
et al. [1] and Hilliard [2]. Software architecture is what is essential or
unifying about a software system: the set of properties which define the software system’s
structure, or form, the behavior, function, value, cost, and risk. Software architecture can
be defined as:
The view held on software architecture in this thesis, joined with the above definition,
is that an architecture is a conception of a system, an abstract form. Software architec-
ture can exist without any documentation or any concrete or physical representation.
Software architecture embodies the essential or key properties about a software system.
The architecture is understood in the context of the software system, not in isolation. To
understand the architecture it is essential to understand the environment and how the
system relates to it. Software architecture is, on the other hand, not an overall physical
structure of the system.
In other words, every software system has an architecture, just as every house, bridge or
airplane, but it does not need to be documented or understood.
The practice and the study of software architecture is concerned with the tools, methods
and ideas used to create fundamental system structure. Just as when building a house, it is
not necessary to use architectural discipline or build on any architectural basis, but doing
so makes it more likely that the house withstands external or internal stress such as
wind, snow, earthquakes, ageing, etc. The same goes for software architecture: it is
a discipline for increasing the quality of the software, helping the developer work with
the requirements of the system, and extending the lifetime and overall quality of the system.
Architectural requirements, architectural design, architectural description and architectural
evaluation are examples of techniques and activities that a software architect uses in
the architectural design process (see figure 2.3) to define the software architecture.
[Figure 2.3: the architectural design process. Labels: Architectural Prototyping; Attribute-Driven Design, Architectural Patterns, Styles; Architectural Description; Architecture Tradeoff Analysis Method, Software Architecture Analysis Method; System]
system scale? What is the peak throughput of a software system given certain hardware?
How likely is the software to fail? How easy is it to manage? How can it be used for
people who are disabled? These characteristics are called “quality attributes” and are the
subject of the next section.
Functionality, or the ability of the system to perform the work for which it was intended,
and quality attributes are closely related but independent of each other. Many architectural
decisions address concerns that are common, driven by the need for the system
to exhibit a certain quality property rather than provide a particular function.
Functionality often takes the front seat, and even the only seat, in the development process.
This is short-sighted, as systems are frequently redesigned because they are difficult
to maintain, port, or scale, or because they are too slow. Software architecture is the first
stage in software creation where quality requirements can be addressed [1].
The choice of function does not, for example, dictate the level of security, performance,
availability or usability. A software architect can choose a desired level of each,
though this is not to say that any level of any quality attribute is achievable, since
high modifiability, for instance, often means lower performance. Furthermore, one cannot,
for example, define complete scalability of a given system, but one can set a desired
scalability level of the system. A quality attribute can be defined as a relative level of quality
needed to fulfil a set requirement.
System quality attributes have been of interest to the software community since the
1970s. In this thesis the Bass et al. [1] description and characteristics will be used.
Here quality attribute scenarios (QAS) are used to define quality attribute requirements.
Bass et al. define quality attribute scenarios as follows: “a quality attribute scenario is a
quality-attribute-specific requirement” [1]. A quality attribute scenario consists of six
parts:
Source of stimulus. This is the entity that generated the stimulus. This can be some role, such as
a developer, a computer system or any other actor.
Stimulus. This is the condition that needs to be considered when it arrives at
a system.
Artifact. This is the artifact that is stimulated. It can be the system itself or some part of
it.
Environment. This defines the conditions when the stimulus occurs. This might be for
example a running state of the system or a development state.
Response. The response is the activity, or the change, undertaken after the arrival of the
stimulus.
Response measure. This is a measure of the response that the requirement can be tested
against.
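The six parts above can be sketched as a small data type. The following is a minimal Java sketch, not part of the prototype: the type and method names are hypothetical, the response measure is assumed to be a time limit, and the sample values are loosely based on the performance example given later in this chapter.

```java
import java.time.Duration;

/** The six parts of a quality attribute scenario, following Bass et al. [1]. */
record QualityAttributeScenario(
        String sourceOfStimulus,
        String stimulus,
        String artifact,
        String environment,
        String response,
        Duration responseMeasure) { // assumption: the measure is a time limit

    /** True when an observed response time is strictly below the response measure. */
    boolean isSatisfiedBy(Duration observed) {
        return observed.compareTo(responseMeasure) < 0;
    }
}

class QasExample {
    /** Hypothetical performance scenario: each request is answered in under a second. */
    static QualityAttributeScenario performanceScenario() {
        return new QualityAttributeScenario(
                "user", "starts a request", "system", "normal operation",
                "request is processed", Duration.ofSeconds(1));
    }

    public static void main(String[] args) {
        QualityAttributeScenario qas = performanceScenario();
        // The response measure is what makes the requirement testable.
        System.out.println(qas.isSatisfiedBy(Duration.ofMillis(400)));
        System.out.println(qas.isSatisfiedBy(Duration.ofSeconds(3)));
    }
}
```

Keeping the response measure as a typed value rather than free text is a design choice of this sketch; it shows how a scenario can be checked against a measurement.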
Table 2.5 shows an example of a quality attribute scenario simplified from a quality at-
tribute scenario for our prototype. Figure 2.4 shows an example of a modifiability sce-
nario.
Bass et al. [1] list six main types of system quality attributes along with examples of
business qualities and architectural qualities. The main system quality attribute types
are:
Availability is concerned with system failure and its associated consequences. Some
aspects of system failure are how frequently failures may occur and what
happens then, the amount of time the system may be out of operation, when
failures may occur safely, how failures can be prevented, and what kinds of
notifications are required when a failure occurs.
Note that a failure and a fault are not the same thing. A fault is a system state that is not
observed by the system’s users, but if it is not handled correctly it may become a failure,
which is observable by the system’s users. A fault might be a lack of storage
resources, which might be solved by freeing some disk space, but if it is not handled
it might stop the operation of the software system, thus leading to a failure.
Modifiability is concerned with the cost of change to a software system. It can be broken
down into what can change (the artifact), and when the change is made and who makes it
(the environment). A change can occur to any aspect of the system (artifact), for example
the functions of the system, its platform or its environment. Change can also
happen at any time (environment), for example a developer may change the source
code or a user may change a language setting on a website.
When a change has been specified, a new implementation must be designed, im-
plemented, tested, and deployed. All of these actions take time and money, both
of which can be measured.
Performance is concerned with timing, e.g. with how long it takes the system to respond
when an event occurs. A performance scenario begins with a request for some
service arriving at the system. Satisfying the request requires resources to be
consumed. While this is happening, the system may be simultaneously servicing other
requests. An example of a performance scenario is: “A user starts 10 requests a
minute under normal operation, and each operation takes less than a second to
respond.”
Security is concerned with the system’s ability to serve its own users while denying any
unauthorized usage. An attempt to breach security is called an attack and can
be an unauthorized attempt to access data, modify data, or deny service to
legitimate users. A secure system can be characterized as a system providing
non-repudiation, confidentiality, integrity, assurance, availability, and auditing [1].
Testability is concerned with the ease with which software can be made to demonstrate
its faults or correctness. Testability refers to the probability, assuming that the
software has at least one fault, that the software will fail on its next test execu-
tion [1].
Usability is concerned with how easy it is for the user to accomplish a desired task and
how the system supports its users. It can be broken down into five areas: learning
system features, using a system efficiently, minimizing the impact of errors,
adapting the system to user needs, and increasing user confidence and satisfaction.
Along with these main types of quality attributes, Bass et al. [1] mention custom
system quality attributes, such as scalability, portability, and interoperability. They
also define a generic QAS for each quality attribute. For each use of these QAS the user
has to fill out the six parts of the scenario generation framework, that is source, stimulus,
artifact, environment, response, and response measure. The authors also include
business qualities (cost, schedule, market, and marketing considerations) and architectural
qualities (conceptual integrity, correctness and completeness, and buildability), which
we will not discuss further.
The above mentioned qualities can be a good basis when designing software architec-
ture and working on the architectural description of the software architecture. To create
quality attribute scenarios, as seen in table 2.5, a software architect needs to think about
the software architecture with these quality attributes in mind. These quality attributes
were created to include common quality use cases and to be extended to new and cus-
tomized quality attributes.
Module viewpoint is concerned with how the functionality is mapped to the units of
implementation. It visualizes the static view of the system’s architecture by showing the
elements that comprise the system and their relationships. A module view contains
modules with their interfaces and their relations. Here a module is a code or implementation
unit, such as a package, class or interface in Java. Its relations include associations,
generalizations, realizations and dependencies.
Module Viewpoint example. The examples in this section use an imaginary web
messenger software system. In this system, users can identify themselves and start or join
a chat session (“a chat room”). There they can see everyone who has joined the chat and send
messages to the chat session. When a new message arrives at the chat session the client
pulls it from the server. The module view of the web messenger can be described by
using UML class diagrams, describing the system top down by starting with a
top-level diagram and ending at the class or interface level. Figure 2.6 shows a package
diagram for the web messenger software and figure 2.7 displays a class diagram for the
messenger logic package.
[Figure 2.6: Module View example, package overview of the web messenger software system. Packages in WebMessenger: User Interface, Messenger Logic, Data Manager]
Component and Connectors viewpoint (C&C) is concerned with the runtime functionality
of the system. In other words, what does the system do? In this viewpoint, the
system’s software consists of components and connectors. Components are units
of functionality that define which parts of the system are responsible for which
functionality. Connectors are communication and coordination relationships between
components that define how components exchange control and data.
The properties of both the components and the connectors in the architectural
description are described below. This is done with both written explanations and diagrams,
showing protocols, state transitions, threading, concurrency issues or whatever is relevant to
the architecture at hand.
[Figure 2.7: Module View example, decomposition of the Messenger Logic Package. Classes: ChatSession (sessionName, sessionId; submitMessage(), getSessionMessages()), Message (from, time, body), User (nickname, fullName, email, ipAddress)]
C&C Viewpoint example. The web messenger has four major functional parts, as shown
in figure 2.8. Components are represented by UML active objects and connectors by
links with association names and possibly role names. The diagram in figure 2.8 cannot
stand alone, as component and connector names are only indicative of the functional
responsibilities related to each. A description of a component’s functionality in terms of
responsibilities should therefore be provided:
• User Interface is responsible for 1) rendering the presentation of the user
interface, 2) managing requests from the user, mainly submitting new messages and
updating all session messages, 3) handling new messages and making sure they are
representable with the Messenger Logic, and 4) rendering all chat messages.
Just like the components, the connectors also need to be described in more detail. The
level of detail needed depends on the architecture at hand. For some connectors, it may
be sufficient to have a short textual description, but for others it may be best to explain
them with UML sequence diagrams. Our Messenger application has three connectors:
• AJAX, Asynchronous JavaScript and XML, is a web technique for asynchronous
communication between clients and servers.
• JDBC is the connector that handles standard SQL queries with the JDBC protocol.
Sequence diagrams can be used either to describe a connector’s protocol individually or
to provide the “big picture” showing interaction over a set of connectors. In our example
an overall sequence diagram describes the big picture, see figure 2.9.
[Figure 2.8: C&C view of the web messenger. Components: Browser (user), :User Interface (view/controller), :Messenger Logic (model), DataBase (server). Connectors: AJAX, MVC, JDBC]
Allocation viewpoint is concerned with how the software elements of the software sys-
tem are mapped to platform elements in the environment of the system.
[Figure 2.9: C&C view example, sequence diagram, web based messenger. Messages: SubmitMessage(msg); submitMessage(msg,user,session); storeUsersMessage(msg,user,session); “Message is stored.”; getNewMessages(); getSessionMessages(session); return value session_messages]
The deployment viewpoint has two element types, software elements and environment
elements, and three relation types, allocated-to relations, dependencies among software
elements and protocol links among environmental elements showing the communica-
tion protocol used between nodes [1].
Deployment Viewpoint example. Figure 2.10 shows the deployment view of the web
messenger software system using a UML deployment diagram. The deployment is a
three-tier deployment, where presentation runs on the client, domain code runs on a Java
application server, and data is stored on a database server.
– The Browser is the input and final output for the messages.
– The Application Server is the machine serving the UI through a web server
and serving all other application level functionality.
– The Browser displays the client-side presentation and runs client side scripts
and interacts with the User Interface via AJAX.
– The User Interface renders messages and the presentation layer and delivers
– The Messenger Logic keeps track of chat sessions, users participating in these
sessions and messages associated. It interacts with the Data Manager for
secondary permanent storage.
– The Data Manager takes messages, sessions and users and sends them to a
relational database and retrieves sessions from the database.
– MySQL is an open source SQL database which handles database related func-
tionality of the system.
[Figure 2.10: deployment view of the web messenger. Nodes: Browser (User 1), Browser (User 2), Application Server (User Interface, Messenger Logic, Data Manager), Database Server (MySQL)]
Having made and reviewed an architectural description, the next logical step might be
to create an architectural prototype, which is the subject of the next section.
So far a few theoretical techniques that a software architect has at their disposal have been
presented. Next an experimental technique will be reviewed, namely architectural
prototyping. Examples of other experimental techniques, which will not be described further,
include simulation and scenario-based methods with explicit stakeholder involvement [24].
Bardram et al. [3] also point out a number of architectural prototyping characteristics as
they define an ontology of architectural prototyping, which relates to their definition of
architectural prototyping, see figure 2.11. These characteristics are discussed below
along with architectural prototyping in general.
Architectural prototypes can be classified into three general types: exploratory,
experimental and evolutionary. Exploratory architectural prototypes are created to explore
the architecture design space. Multiple prototypes are usually created, analyzed and
executed in order to find a solution to a given problem. Experimental architectural
prototypes are created to evaluate a specific architectural decision. A single prototype is usually
created and evaluated. Evolutionary architectural prototypes are created as a series of
prototypes, where each prototype is built as a revision of the last one.
Software changes range from modifying a tiny part of the system, like a line in a
configuration file, to changing the whole system, but these changes can be categorized into
three types [1]: local, non-local and architectural. A local change is usually a very small
change and can be accomplished by modifying a single element. A non-local change can
be bigger and requires multiple element modifications, but leaves the underlying
architectural structure intact. An architectural change affects the fundamental way in which
the elements interact with each other, that is, the pattern of the architecture, and will
likely require changes all over the system.
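The three categories can be illustrated with a small classification rule. This is a hedged Java sketch with hypothetical names; it assumes the caller already knows which elements a change touches and whether the way elements interact is altered, which in practice is the hard part of the analysis.

```java
import java.util.Set;

/** The three change categories from Bass et al. [1]. */
enum ChangeKind { LOCAL, NON_LOCAL, ARCHITECTURAL }

class ChangeClassifier {
    /**
     * Classifies a change from two inputs (both are assumptions of this sketch):
     * the set of modified elements, and whether the interaction pattern changes.
     */
    static ChangeKind classify(Set<String> modifiedElements, boolean interactionPatternChanges) {
        if (interactionPatternChanges) {
            return ChangeKind.ARCHITECTURAL; // the pattern of the architecture is affected
        }
        // Structure intact: a single element is local, several elements are non-local.
        return modifiedElements.size() <= 1 ? ChangeKind.LOCAL : ChangeKind.NON_LOCAL;
    }

    public static void main(String[] args) {
        System.out.println(classify(Set.of("config.properties"), false));
        System.out.println(classify(Set.of("UserInterface", "MessengerLogic"), false));
        System.out.println(classify(Set.of("MessengerLogic"), true));
    }
}
```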
Architectural change is about the ability to model and analyze the change of software
architecture, and our interest is mainly in runtime architectural change. That is, how
does a developer change the architecture of a running software system? This change
is the key to building autonomic systems [4]. In cloud computing, scalability is a big
factor in running software systems. These systems need to be able to run on multiple
computing instances, they need to be able to scale instantly and they need to be able to modify
themselves to adapt to rapidly changing environments. It can be extremely hard to build
programs that handle these requirements with a static and fixed architecture, which makes
architectural change a favorable factor for the scalability of a cloud computing software
architecture.
ware architecture. Architectural scripting is a way to model the dynamic aspects of run-
Device
A device is a physical or virtual (VM or JVM in Java) device.
Component
A component is a unit of deployment. This can be a package of executable code
with explicit dependencies, usually a binary package. Components can export
modules that other components may require. Components are deployed to devices
and provide services, which can be an instantiated form of the component.
Module
A module is a typed library, a class, an API etc.
Service
A service is a typed unit of instantiated software, a running instance of a component,
with explicit dependencies in the form of required interfaces and explicit capabilities in the
form of provided interfaces.
Interface
An interface is a typed unit of association between services. In many cases it is
implemented as an object reference (required) or as an object (provided).
Connector
A connector is a way to expose a method to show an interface or a service.
An architectural operation is a unit of change for each of these parts, for example an
updated module, a newly exposed service, or an added component. The Architectural
Scripting Language is a scripting language that defines these operations and allows multiple
sequential changes to the software architecture. In the Hydra project [8] examples
of operations are: deploying a component onto a device, deploy_component(component,
device); starting a component’s service on a device, start_service(service, component,
device); and starting a device, start_device(device).
Architectural operations are explained below with the example of the Web Messenger:
From the allocation viewpoint, we can think of the environmental element Application
Server as a device.
The software elements from the allocation viewpoint, User Interface, Messenger Logic
and Data Manager, would be represented as components.
Libraries that the Web Messenger uses, such as the MySQL JDBC connector, are examples
of modules.
The components expose services. For example (from the C&C viewpoint sequence
diagram) the messenger logic might have a public interface for the UI to communicate
with (with functions like sendMessage() and getMessages()). The instance
of that class would be exposed as a service.
Now if one would want to update the MessengerLogic component and start it as a service,
one could do so with an ASL script:
update_component(ApplicationServer, MessengerLogic)
start_service(MessengerLogicService, MessengerLogic, ApplicationServer)
This script would 1) update the MessengerLogic component on the device
ApplicationServer, assuming that it knows where the updated version is, and 2) start the service
MessengerLogicService, which the MessengerLogic component exposes, on the device
ApplicationServer. As can be seen, although this is a significant architectural change, deploying
this kind of change is easy and, given that the ASL implementation is correct, the
operations taken can be tested and verified with mathematical methods. The other components
should not be directly affected by this change and therefore the downtime of the software
is minimal. Modelling runtime change to an architecture with the notion of a script has
several advantages [4]:
• A script has its limitations, i.e. an ASL script cannot be used to configure devices,
modules, or services, e.g. by setting a web module to use port 443 for SSL.
• An ASL device can hardly fully encapsulate the device it represents, e.g. does the
device have a touch screen? Is the required kernel installed? Does it have a firewall
configured? For an ASL script to fully encapsulate a device, the scripting language and
interpreter need to know most aspects of the device, such as screen resolution
and type, keyboard buttons available, kernel modules available, etc. No
implementation of ASL can include all possibilities of device types and setups, but given a
limited and fixed set of device attributes this can be done.
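The effect of the two-operation Web Messenger script from this section can be mimicked with a small in-memory model. The following Java sketch is an illustration only and not the ASL implementation: the class names, the version counter and the error handling are assumptions made here, and only the operation names and argument order follow the script.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** A device holding deployed components (name -> version) and running services. */
class AslDevice {
    final String name;
    final Map<String, Integer> componentVersions = new HashMap<>();
    final Set<String> runningServices = new HashSet<>();

    AslDevice(String name) {
        this.name = name;
    }
}

/** Hypothetical in-memory stand-ins for two ASL operations. */
class AslOperations {
    /** Mirrors update_component(device, component): bumps the deployed version. */
    static void updateComponent(AslDevice device, String component) {
        requireDeployed(device, component);
        device.componentVersions.merge(component, 1, Integer::sum);
    }

    /** Mirrors start_service(service, component, device). */
    static void startService(String service, String component, AslDevice device) {
        requireDeployed(device, component);
        device.runningServices.add(service);
    }

    private static void requireDeployed(AslDevice device, String component) {
        if (!device.componentVersions.containsKey(component)) {
            throw new IllegalStateException(component + " is not deployed on " + device.name);
        }
    }
}

class AslScriptExample {
    /** Runs the two operations of the script against a fresh in-memory device. */
    static AslDevice run() {
        AslDevice applicationServer = new AslDevice("ApplicationServer");
        applicationServer.componentVersions.put("MessengerLogic", 1); // assumed already deployed
        AslOperations.updateComponent(applicationServer, "MessengerLogic");
        AslOperations.startService("MessengerLogicService", "MessengerLogic", applicationServer);
        return applicationServer;
    }

    public static void main(String[] args) {
        AslDevice device = run();
        System.out.println(device.componentVersions + " " + device.runningServices);
    }
}
```

Because each operation is a small, checkable state change, a sequence of operations can be replayed against such a model before it is applied to a running system.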
3. Cloud Computing and Architectural
Operations
Having reviewed the background of cloud computing and architectural operations,
it can be argued that the Architectural Scripting Language can provide advantages for
runtime management of the architecture of cloud computing software systems. To benefit
from cloud computing scalability, cloud applications need to be changeable so that they can
adapt to changes in the environment. The Architectural Scripting Language can help with that
adaptation through architectural change of running software. Thus it is concluded that
the Architectural Scripting Language in cloud computing is a combination worthy of further
investigation. Below, our design and implementation of a Cloud Architectural Scripting
Language (Cloud ASL) will be described, in addition to related work done in that field.
The Architectural Scripting Language and cloud computing could complement each
other, and combining them might result in scalable cloud computing
software. However, ASL is not designed for cloud computing, which provides the
opportunity to design and develop an ASL implementation specific to our needs for a cloud
computing infrastructure. This implementation will hereafter be called Cloud ASL. The
aim is to use the IaaS model for our implementation. The other service models do not
apply as well to our problem, since they do not offer the same freedom to use languages,
libraries and other technology as IaaS does. The other service models, in contrast, offer
more and simpler scalability options, and do not require knowledge of operating systems
and distribution.
For designing Cloud ASL architecture and operations we can review section 2.3.2 and
see some modifications and simplifications that are possible, for example removing con-
nectors and modules from the ontology, leaving devices, components, services and in-
terfaces.
[Figure: Cloud ASL ontology. Elements: Component, Device, Service, Interface. Relations: deployed to, requires/provides]
The Cloud ASL operations are the main way to interact with Cloud ASL. The opera-
tions needed to model the architectural change on cloud computing architecture can be
limited to three types of operations: device, component and service operations.
Devices
Operation | Description
device:create_instance_device( type ) | Create a new cloud instance of a certain type. This instance will hold the VM.
device:create_vm_device( device ) | Creates a new VM on an already running device/instance. This can be useful for running multiple programs on the same instance.
boolean:start_device( device ) | Start a stopped device.
boolean:stop_device( device ) | Stop a running instance.
boolean:destroy_device( device ) | Shuts down and destroys cloud instance. This action is irreversible and all information on that device will be lost. If multiple VMs are running on this device they will be lost.
device:clone_device( device ) | Creates a direct replica of a device.
Components
Operation | Description
component:install_component( url, device ) | This installs a component, from a given URL to a given device
boolean:uninstall_component( component, device ) | Uninstalls a given component from a given device
boolean:start_component( component, device ) | Start a given component on a given device
boolean:stop_component( component, device ) | Stops a given component on given device
boolean:update_component( component, url, device ) | Updates or upgrades a given component, on a given device with component from a given URL
Services
Operation | Description
service:register_service( component, device ) | Register a new service from a component on a device
boolean:unregister_service( service, component, device ) | Unregister a service from a component on a device
boolean:enable_service( service, component, device ) | Start a stopped service
boolean:disable_service( service, component, device ) | Stop a running service
As an example of a Cloud ASL script, a deployment of the web messenger previously
introduced will be used. As seen in section 2.2.2, the example of the web messenger consists
of three deployment modules, or components from now on: User Interface (ui.jar),
Messenger Logic (mess.jar) and DataConnector (db.jar). These components all run on the
same server, or device from now on.
In this example we assume that the component JAR files are located at some location,
http://URI. After running this script the web messenger should be running on the device
“server”. Nothing has to be deployed to the user device as all functionality needed for the
user comes from the user’s browser.
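The deployment described above can be sketched as a sequence of device and component operations. The Java code below is a minimal illustration under stated assumptions and is not the Cloud ASL implementation: the interface covers only three of the operations with Java-style names, devices and components are represented as plain strings, the implementation is an in-memory fake that never contacts a cloud provider, and the instance type "m1.small" and the generated device name are example values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Subset of the Cloud ASL operations, with Java-style names (assumption of this sketch). */
interface CloudAsl {
    String createInstanceDevice(String type);                // create_instance_device( type )
    String installComponent(String url, String device);      // install_component( url, device )
    boolean startComponent(String component, String device); // start_component( component, device )
}

/** In-memory fake: records state instead of talking to a cloud provider. */
class InMemoryCloudAsl implements CloudAsl {
    /** device -> (component -> started?) */
    final Map<String, Map<String, Boolean>> devices = new LinkedHashMap<>();

    public String createInstanceDevice(String type) {
        String device = "device-" + (devices.size() + 1); // hypothetical naming scheme
        devices.put(device, new LinkedHashMap<>());
        return device;
    }

    public String installComponent(String url, String device) {
        Map<String, Boolean> components = devices.get(device);
        if (components == null) {
            throw new IllegalArgumentException("unknown device: " + device);
        }
        String component = url.substring(url.lastIndexOf('/') + 1); // file name as component id
        components.put(component, false); // installed but not started
        return component;
    }

    public boolean startComponent(String component, String device) {
        Map<String, Boolean> components = devices.get(device);
        if (components == null || !components.containsKey(component)) {
            return false;
        }
        components.put(component, true);
        return true;
    }
}

class WebMessengerDeployment {
    /** Creates one device, then installs and starts the three web messenger components. */
    static String deploy(CloudAsl asl, String baseUrl) {
        String server = asl.createInstanceDevice("m1.small");
        for (String jar : List.of("ui.jar", "mess.jar", "db.jar")) {
            String component = asl.installComponent(baseUrl + "/" + jar, server);
            asl.startComponent(component, server);
        }
        return server;
    }

    public static void main(String[] args) {
        InMemoryCloudAsl asl = new InMemoryCloudAsl();
        String server = deploy(asl, "http://URI"); // placeholder location from the text
        System.out.println(server + " -> " + asl.devices.get(server));
    }
}
```

Programming the deployment against an interface keeps the script independent of the cloud provider; a real implementation would replace the in-memory fake.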
For our implementation of Cloud ASL more implementation decisions had to be made.
The first decision was the programming language to use, because other aspects, such
as frameworks and the Cloud ASL language specifications, might depend on it. When
choosing the programming language, prior knowledge, platform independence,
modularity, performance, framework availability, and support had to be kept in mind. Java¹
was chosen as the main programming language. Java is a cross-platform, object-oriented,
VM-based programming language, which is easily modularized with the use of JAR files to
distribute Java applications or libraries, in the form of classes and associated metadata
and resources (text, images, etc.). OSGi² is a modular framework specification for Java
which suits our need to abstract Java modules (JAR) and the Java Virtual Machine.
The author of this thesis has prior experience with it, making development and implementation
faster. Apache Felix OSGi³ was used as our OSGi implementation.
Amazon EC2 was chosen as our IaaS cloud computing service provider due to many
factors, see section 3.3. Amazon AWS, with Amazon EC2, is the cloud industry’s
leading provider of IaaS and has a large and active community. Amazon AWS has
database and queueing services that could be used directly, and a complete, supported
Java client for its APIs. For this project Amazon AWS provided generous funding
for development and testing.
For our Cloud ASL language implementation it had to be decided whether this domain
specific language should be internal or external and, if external, whether it should be Turing
complete. The difference between internal and external DSLs has been described by M. Fowler [29]:
“External DSLs are written in a different language than the main (host) lan-
guage of the application and are transformed into it using some form of com-
piler or interpreter. The Unix little languages, active data models, and XML
configuration files all fall into this category. Internal DSLs morph the host
language into a DSL itself - the Lisp tradition is the best example of this.”
To make an external domain specific language (external DSL) we would need to create
it from scratch. We could, for example, build it in XML or HTML, making it easily
viewable, or we could use a simple scripting language. Because making the language an
internal DSL could give us a Turing complete language with possibly less effort, it was
decided to direct our work there. By using an internal DSL it was possible to use
or extend an existing language implementation, which would be Turing complete and
would include the features that were needed. As the Java programming language
had already been chosen, there were many possible implementations, such as importing
Java code as a script using an embedded compiler, like Janino⁴, or importing Java classes
dynamically. A Java embedded scripting language was chosen for this project. This
enables us to use the full Java API, to interact in both directions between Cloud ASL and the
software using it, and to import scripts into a running JVM. After comparing Java scripting
languages, such as Groovy, JRuby, Scala, Clojure and Jython, Groovy was chosen as our
scripting language. Groovy has gained a lot of momentum in the Java world and is now
supported in most Java IDEs, and a number of frameworks have been made for
Groovy, such as Grails⁵. The Groovy language is largely a superset of the Java language,
so most Java code is also valid Groovy code, and the two languages can be used together.
Groovy’s features are similar to those of Python,
Ruby, Perl, and Smalltalk, making it easy for many to use. Groovy has native support for
importing Groovy scripts into a Java program, which makes it ideal for our use.
1 http://www.java.com/
2 The OSGi framework is a module system and service platform for the Java programming language that
implements a complete and dynamic component model. http://www.osgi.org/
3 Apache Felix is a community effort to implement the OSGi R4 Service Platform and other interesting
OSGi-related technologies under the Apache license. http://felix.apache.org/
As we have chosen a programming language and framework, we can map them to our
Cloud ASL ontology:
Devices
Devices are mapped to a JVM running on a cloud instance. There can be multiple JVMs
running on each cloud instance, and therefore multiple devices on each instance. The cloud
computing infrastructure used here is Amazon’s EC2, and the device operations interact
with it through the cloud service API, the Amazon AWS SDK. Details of our
cloud implementation are the subject of section 3.3.
Components
In the OSGi framework, software is modularized into so-called bundles. Each bundle
should be focused on specific functionality, just like a normal Java JAR file. We
associate a component with a single OSGi bundle, which is deployed to a single running
OSGi framework, on a JVM, on a single device/instance. This component can provide a
service through an interface. The Cloud ASL operations use Telnet as a basis to interact
with the OSGi framework on a device, which gives control over the framework’s API.
Services
As previously stated, services are instances of components and are registered as
interfaces. We use OSGi’s declarative services⁶ for our Cloud ASL implementation,
which are responsible for starting, stopping, and registering services as needed.
This makes Cloud ASL service operations unnecessary, and they are therefore not
implemented in our version.
4 Janino is a compiler that reads a Java expression, block, class body, source file or a set of source files,
and generates Java byte-code that is loaded and executed directly. http://www.janinio.org
5 Grails, formerly Groovy on Rails. http://grails.org
Interfaces
An interface is a typed unit of association between services. A component provides a
service through an interface. Since components require the interface of a service to be
available at compile time, creation and registration of interfaces cannot be as dynamic
as for devices and components. We implemented the Cloud ASL interfaces as a special
bundle, which was installed on all devices, but this can be done in multiple ways.
Module
Our implementation relies on bundling libraries, APIs or other modules as or in components,
so we can leave the module concept out of our implementation.
Connector
OSGi declarative services are used to connect services internally, which takes care of
all internal service connections.
Since we are not implementing operations for services, interfaces, modules or connectors,
we only need to implement operations for devices and components.
Devices
Table 3.4 does not include start_instance(), stop_instance() and clone_device(). Amazon
EC2 does not support stopping running instances that are backed by the “instance store
(S3)”, as opposed to “EBS”. Since we started with the instance store, starting and stopping
of instances was left out. The instance store (S3, Simple Storage Service) is the original
way to store images of virtual machines, whereas EBS (Elastic Block Store) is a more recent
way to store images and data. Cloning instances is supported by EBS. Therefore, instead of
implementing a complex cloning mechanism, that functionality was left out so that it can be
supported when the implementation is changed to support EBS. This limitation is further
discussed in chapter 5.
6
OSGi Declarative Services allow automatic registration, activation and deactivation of OSGi services.
http://felix.apache.org/site/apache-felix-service-component-runtime.html
Method Summary
Device create_instance_device(String type)
    Creates a new cloud instance of a certain type, for example “m1.small” for
    a regular small Amazon EC2 instance type. It then installs a JVM and OSGi
    onto that instance and starts it.
Device create_jvm_device(String type, Device device)
    Creates a new JVM on an already running device/instance and starts
    OSGi on it. This can be useful for running multiple programs on the same
    instance.
void restart_device(Device device)
    Restarts the device: stops the OSGi framework, reboots the cloud instance
    and starts the OSGi framework again. The OSGi framework is cached and is
    therefore started in the same context as it was in when shut down.
void destroy_device(Device device)
    Stops the OSGi framework and shuts down the cloud instance. This
    action is irreversible and all information on that device will be lost.
Device[] get_devices()
    Returns an array of the currently running devices.
Table 3.4.: List of implemented device ASL operations from table 3.1
Components
Method Summary
Component install_component(Device device, String URI)
    Installs an OSGi bundle from a given URI into the OSGi framework of the given device.
void uninstall_component(Component component)
    Uninstalls an OSGi bundle from an OSGi framework.
void start_component(Component component)
    Starts an OSGi bundle. By default, bundles are not started automatically
    when installed.
void stop_component(Component component)
    Stops an OSGi bundle.
void update_component(Component component, String componentURI)
    Updates/upgrades a bundle with a version from the given URI.
Component[] get_components(Device device)
    Returns an array of the components installed on a given device.
Table 3.5.: List of implemented component ASL operations from table 3.2
Although the architecture of Cloud ASL was designed before implementation, it had
to be changed as the limitations and restrictions of our external environment were
discovered. Below, the final architecture is described using the methods from
section 2.2.2.
Component and Connectors Viewpoint Cloud ASL has five major functional parts as
shown in the diagram in figure 3.2.
[Figure 3.2: Component and connector view of Cloud ASL. Components: :CloudASL, :CloudDevice, :Device, :Component, :Cloud API and :OSGi. Connectors: SOAP, Java, SSH and Telnet.]
• Cloud ASL. Responsible for interacting with other modules and the main interface
to Cloud ASL. Cloud ASL includes all functions of the Cloud ASL API and should
be a sufficient way for any external module to interact with Cloud ASL. Cloud ASL
scripts can be passed to this component, which parses the script and runs as a
Groovy script with the Cloud ASL framework.
• Device. Responsible for 1) managing devices through Cloud ASL’s device methods
and 2) interacting with cloud APIs through the Cloud component.
• Cloud API. Responsible for managing cloud instances. This is where all our cloud
functionality lies, currently our EC2 interactions and API. To add or switch cloud
providers only this part needs to be modified. This could be, for example, by adding
a REST or SOAP connector to another cloud platform.
• SOAP or Simple Object Access Protocol is a web service protocol based on XML
messages through HTTP messaging.
• SSH or Secure Shell is a network protocol that allows data to be exchanged using
a secure channel between two networked devices.
• Java method calls are used to communicate between classes, components and li-
braries.
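The claim that only the Cloud API part needs to change when switching providers can be sketched as a small Java interface. This is a minimal sketch under our own assumptions: the names CloudProvider and InMemoryProvider are hypothetical, and the in-memory implementation merely stands in for a real provider binding such as the EC2 one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical provider boundary: the rest of the system would depend only on
// this interface, so another cloud platform means another implementation.
interface CloudProvider {
    // Creates an instance of the given type and returns its identifier.
    String createInstance(String type);

    // Shuts the instance down; unknown identifiers are ignored.
    void destroy(String instanceId);

    // Reports whether the instance is currently known to be running.
    boolean isRunning(String instanceId);
}

// Stand-in implementation that only tracks state in memory. A real binding
// would call the provider's web service here instead.
final class InMemoryProvider implements CloudProvider {
    private final Map<String, String> running = new HashMap<>();
    private int nextId = 1;

    @Override
    public String createInstance(String type) {
        String id = "i-" + nextId++;
        running.put(id, type);
        return id;
    }

    @Override
    public void destroy(String instanceId) {
        running.remove(instanceId);
    }

    @Override
    public boolean isRunning(String instanceId) {
        return running.containsKey(instanceId);
    }
}
```

An in-memory implementation like this is also convenient for exercising device operations without paying for real instances.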
In figure 3.3 a sequence diagram displays the scenario of starting a device, installing and
running a component, uninstalling the component and then destroying the device.
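The scenario can also be written down as a sequence of Cloud ASL operations. The following Java sketch uses a hypothetical in-memory stand-in, InMemoryAsl, that borrows the operation names from tables 3.4 and 3.5 but only records what was called; it does not contact EC2 or OSGi, and it returns a list rather than an array from get_devices() for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Cloud ASL operations; it only tracks state.
final class InMemoryAsl {
    static final class Device {
        final String type;
        final List<Component> components = new ArrayList<>();
        Device(String type) { this.type = type; }
    }

    static final class Component {
        final Device device;
        final String uri;
        boolean started;
        Component(Device device, String uri) { this.device = device; this.uri = uri; }
    }

    private final List<Device> devices = new ArrayList<>();
    final List<String> log = new ArrayList<>();  // names of the operations, in call order

    Device create_instance_device(String type) {
        Device device = new Device(type);
        devices.add(device);
        log.add("create_instance_device");
        return device;
    }

    Component install_component(Device device, String uri) {
        Component component = new Component(device, uri);
        device.components.add(component);
        log.add("install_component");
        return component;
    }

    void start_component(Component component) {
        component.started = true;
        log.add("start_component");
    }

    void uninstall_component(Component component) {
        component.started = false;
        component.device.components.remove(component);
        log.add("uninstall_component");
    }

    void destroy_device(Device device) {
        devices.remove(device);
        log.add("destroy_device");
    }

    List<Device> get_devices() { return new ArrayList<>(devices); }

    // The scenario: start a device, install and run a component,
    // uninstall the component and destroy the device.
    static InMemoryAsl runScenario() {
        InMemoryAsl asl = new InMemoryAsl();
        Device device = asl.create_instance_device("m1.small");
        Component component = asl.install_component(device, "http://URI/ui.jar");
        asl.start_component(component);
        asl.uninstall_component(component);
        asl.destroy_device(device);
        return asl;
    }
}
```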
Module View The module structure of our architecture is shown in the UML diagrams
below, starting with the package overview of Cloud ASL in figure 3.4. The ASL
package and its dependencies are further described in figure 3.5, the device package
in figure 3.6, the component package in figure 3.7, the cloud package in figure 3.8,
and finally the OSGi package in figure 3.9. Dependencies
among packages are also shown; these dependencies arise because of relationships
among classes in different packages.
[Figure 3.4: Package overview of Cloud ASL. Packages: ASL, Device, Component, Cloud, OSGi.]
[Figure 3.5: ASL package. Class ASL (attribute DevicesList) with operations create_instance_device(String), create_jvm_device(String, Device), clone_device(Device), start_device(Device), stop_device(Device), destroy_device(Device), getDeviceByIP(String), getDeviceById(String), getDevices(), install_component(Device, String), uninstall_component(Component), start_component(Component), stop_component(Component), update_component(Component, String), runScriptString(String), runScriptUrl(String). Depends on Device and Component.]
[Figure 3.6: Device package. Class Device with operations create_instance(String), create_jvm(String), clone(), destroy(), start(), stop(), get_devices(). Depends on Cloud.]
[Figure 3.7: Component package. Class Component with operations install(Device, String), uninstall(), update(String), start(), stop(). Depends on OSGi.]
[Figure 3.8: Cloud package. Class Cloud (runningInstances, one-to-many association with CloudInstance) with operations createKeyPair(String), createInstance(String), destroy(Device), runScript(CloudInstance, String). Class CloudInstance with attributes FingerPrint, Id, Name, PrivDNS, PrivIP, PubDNS, PubIP, ReservationId, Type, publicKeyName, state, stateCode, user. Class EC2 with operations copyScript(CloudInstance, String, String), createKeyPair(String), createInstance(String), destroy(Device), runScript(CloudInstance, String). Class SimpleDB with operations getInstance(CloudInstance), registerInstance(CloudInstance), updateInstance(CloudInstance).]
[Figure 3.9: OSGi package. Class Connector with operations InstallBundle(Device, String), start(), stop(), uninstall(), update(String). Class TelnetConnector (attributes device, bundleId) with the same operations.]
Deployment View Figure 3.10 demonstrates the deployment view of Cloud ASL using
a UML deployment diagram. The deployment diagram shows Cloud ASL to be running
on one server and the OSGi components running on different servers. However, in re-
ality, they could be running on the same server, and multiple OSGi instances could be
running on a single server.
– The Cloud ASL server is the device running the Cloud ASL code. This can be one
of the Cloud ASL nodes.
– Cloud ASL is the executable code that manages Cloud ASL devices. It
communicates with the cloud through SOAP, with the Cloud ASL device through
SSH and with OSGi through Telnet.
– The Cloud ASL device is a device, or one of several devices, that Cloud ASL
interacts with. It is the cloud device that hosts OSGi and the Cloud ASL components.
– OSGi. This is the OSGi framework which hosts Cloud ASL components.
– Cloud API. This is the web service that operates the cloud instances and is
responsible for creating, destroying, starting and stopping cloud devices.
[Figure 3.10: Deployment view of Cloud ASL. Nodes: :CloudASL server (hosting CloudASL), Cloud API, and :CloudASL device (hosting OSGi).]
The web messenger example will be used again in this section, but because our imple-
mentation requires that interfaces be stored in a special library component, that com-
ponent will be added to this example. In listing 3.2 we create a device, install and start
required components to run our web messenger example.
// creates a small instance with the variable name "server"
Device server = create_instance_device("m1.small")
// since devices in our implementation start on creation, we do not need to start it
// next we install our components on the server
Component libCmp = install_component(server, "http://URI/library.jar")
Component uiCmp = install_component(server, "http://URI/ui.jar")
Component messCmp = install_component(server, "http://URI/mess.jar")
Component dbCmp = install_component(server, "http://URI/db.jar")
When binding ASL to clouds, some decisions had to be made. The first is
the type of cloud computing to use. The choice was between IaaS and PaaS, but because
Google App Engine was the only PaaS provider that supported Java, and it supported
neither threading nor our basic model of interchangeable executable components, IaaS was
chosen. For IaaS, private and public clouds could be used; community and hybrid clouds
were out of the question, as we had no access to them. At the beginning of this
project, access was granted to hardware from the University of Iceland and it was decided
to set up our own private cloud based on the Eucalyptus cloud computing infrastructure
software.
3.3.1. Eucalyptus
In early October 2009 work began on setting up the Eucalyptus open source cloud com-
puting infrastructure on the University of Iceland computer grid. However, due to many
unforeseen problems this work was not successful. At first we had problems getting the
Eucalyptus infrastructure to work with the networking environment, mainly the main
DHCP server at the University of Iceland. When this problem was solved we could not
carry on due to lack of access to administrators. After unsuccessful attempts at getting
access to hardware from other sources we decided to use a public infrastructure.
not to be mature and functional enough for our use and therefore Amazon’s SDK was
chosen. Amazon granted us $100 in funding to use with Amazon Web Services, which
gave us the opportunity to use the Amazon cloud as needed.
First an Amazon Machine Image (AMI) was created, which is a virtual image containing
the operating system partition (32-bit Ubuntu 9.10), a swap partition and a data partition.
The AMI was stored on Amazon’s instance store. On that virtual image we installed OpenJDK,
an open source community-maintained Java implementation, Apache Felix, an OSGi
implementation, and the OSGi bundles/ASL components needed to
start with the ASL implementation.
The Cloud ASL implementation in our case implements cloud operations in the following
way:
create_jvm_device requires a running device with OSGi set up. We connect to that
machine via JSch, copy the already running implementation of OSGi to a new
location and start the OSGi framework from there.
start_device and stop_device: As Amazon’s EC2 does not support stopping or starting
instances stored in the instance store, these operations were skipped. If the AMI
were transferred to Amazon’s EBS, these features could easily be enabled.
Once Cloud ASL is defined and set up, it has to be possible to test it in a real-world
scenario. Therefore a software system had to be created that is capable of running on a
cloud computing infrastructure and has something to gain from Cloud ASL. That is, the
software system has to be able to run on multiple computing instances at the same time and
needs to be scalable. For this work the architectural design process from figure 2.3 is
used. Each part of that figure is the topic of a subsection where the individual design
steps are shown for our software, ending with an architectural prototype. This prototype
was named “The Turnip”, and that name will be used from now on.
8
JSch is a pure Java implementation of SSH2, allowing connecting and file transfer to an sshd server
with Java programs. http://www.jcraft.com/jsch/
As the diagram in figure 2.3 displays, the first step is to gather architectural require-
ments, see section 3.4.1, which are then evaluated and become the basis of architectural
design, section 3.4.2. From these designs architectural descriptions are created, section 3.4.3.
The evaluation is a part of chapter 4. The resulting system is the topic of section 3.4.4.
Functional Requirements
As the main function of our software prototype has not yet been revealed, we start
by focusing on the underlying requirements that Cloud ASL and our project impose.
The approach of Christensen et al. [22] is used: we create architecturally
significant scenarios, a subset of the overall scenarios, that provide the functional
requirements for the system.
In short these requirements are: Cloud ASL support, cloud connectivity and modularity.
As a scenario for our experiment it was decided to implement a distributed ray tracing
application. Ray tracing has the advantages that it is time and computing intensive, that
it is visual, since a ray tracing job results in an image, that one big job can easily be
broken down into multiple smaller jobs, and that it might interest third-party
individuals. The disadvantages of using ray tracing are that there are not many active open
source implementations, that ray tracing is a complex process which might be hard to
distribute, and that individuals with little knowledge of computer graphics might not be
familiar with the concept.
Use Cases
As a user I can create ray-tracing jobs and choose on how many computing instances
they run at a time, and when a job is finished I can save the results of the ray tracing
job.
As a user I can shut down running computing instances after I have finished working
with them.
As a user I would like to be able to monitor the result of the currently running ray tracing
job.
Quality requirements
Our architectural requirements will be described in this section, with a set of significant
quality attribute scenarios. The goal of describing quality requirements is to help with
construction of “test cases”, where architectural quality attributes may be compared and
evaluated, see section 2.2.1.
2. Scalability. The system shall be able to perform satisfactorily from one to hundreds
of computing instances and shall be able to scale both down and up.
Here, a quality attribute is used that is not part of the main quality attributes of Bass et
al. [1]. Our scalability scenario generation framework, expressed in the form of the
generic scenario of Bass et al., is as follows:
Scenario Parts
power
Stimulus: The request to modify system capacity, for example increasing the number of supported users.
Artifact: Computing instances, computing monitoring framework or something similar.
Environment: Runtime, or development for pre-emptive measures.
Response: What has been done to meet the change in scalability, for example more cloud computing instances have been deployed.
Response Measure: Increased/reduced number of computing instances, more/less disk space given, or increased network throughput.
Here availability, security, testability and usability are left out. The reasons why those
are not considered main quality attribute requirements in our case are:
Testability Making a distributed and cloud-enabled architectural prototype, which runs
a foreign domain-specific language, testable in an execution-based environment
could be a whole new thesis on its own, and is therefore not in the scope of
this project.
Usability As the goal of this experiment is not to create an easy-to-use application, but
to create a case study that uses our Cloud ASL implementation, usability was not
defined as a required quality attribute.
Now that the requirements are established, a closer look can be taken at the architectural
design and the decisions made regarding it. Architectural design can be approached
through architectural prototypes, attribute-driven design, and architectural patterns and styles.
For architectural prototypes several choices are available, see section 2.2.3. Our
prototype is of the experimental type, which means that only one prototype had to be made
to evaluate whether our architectural decisions are valid. The characteristics aimed for
in the prototype are exploration and learning, and quality attributes. For exploration
and learning the goal is to learn how our prototype works with Cloud ASL and how the
architecture can or should be made. Quality attributes are a major motivation for this
prototype and support attribute-driven design, so they are a helpful characteristic.
The architectural pattern we are using is the client-server pattern.
For the architectural description, the module viewpoint, the component and connector
viewpoint and the allocation viewpoint were used, as described in Christensen et al. [22]. We
start by describing the C&C viewpoint.
Component and Connectors Viewpoint The experiment has five major functional and
three assisting functional parts, or components, as shown in the C&C view in figure 3.11,
where they are presented as UML active objects. Connectors are presented as links with
association names and possibly role names. As this figure does not explain the C&C view
on its own, descriptions of the components’ functionality are provided in terms of
responsibilities:
• User Interface is responsible for 1) giving the user available functionalities and 2) giv-
ing the user information about the status of rendering jobs.
• Request Manager is responsible for 1) managing requests from the user; 2) working
as a server in the software layout and being the communication layer between other
critical components where needed and 3) being the front-end for the user interface.
• ASL is responsible for 1) creating and destroying cloud computing instances
(devices) through the Cloud API and 2) installing, starting, stopping, updating
and uninstalling components on devices. This is Cloud ASL from section 3.2.2.
• Cloud API is responsible for 1) being the interface and connector to the cloud;
2) creating and destroying cloud instances and 3) being as independent of any
specific cloud provider as possible. This is the package Cloud from section 3.2.2.
• Sunflow is responsible for 1) rendering images and 2) exposing basic interfaces and
methods for the rendering process.
• Web Browser is responsible for 1) presenting the User Interface and 2) delivering
the functionalities needed for the user interface to communicate with the request
manager.
• AJAX, or Asynchronous JavaScript and XML, is a set of web techniques that lets clients
(web browsers) communicate asynchronously with servers.
• SSH or Secure Shell is a network protocol that allows data to be exchanged using
a secure channel between two networked devices.
• Java method calls are used to communicate between classes, components and li-
braries.
[Figure 3.11: Component and connector view of the Turnip. Components: :Web Browser (client), :Manager Server, :User Interface, :Request Manager, :Worker Factory, :ASL, :Cloud, :SunFlow (library), :Worker and Library. Connectors: AJAX, Java, r-OSGi, SSH and Telnet.]
The interaction between a set of connectors is not revealed in a single sequence diagram.
Therefore a set of examples is presented, that show the interaction between connectors
in concrete contexts. In figure 3.12 an example of worker creation is shown and in figure
3.13 the rendering process is displayed.
[Figure 3.12: Sequence diagram of worker creation. Messages in order: addWorker; addWorker; create_instance_device(); create_device(); SSH:run_post_startup_script; SSH:start_osgi; install_component(worker); Telnet:install worker bundle; install_component(library); Telnet:install library bundle; start_component(worker); Telnet:start worker bundle; start_component(library); Telnet:start library bundle; register worker service; AnnounceRequestManger(); register RequestManager Service.]
[Figure 3.13: Sequence diagram of the rendering process. Lifelines: IndexServlet, RequestManager, WorkerThread, Worker, SunFlowAPI, Scene, BucketRenderer, BucketRender, RemoteDisplay, ImagePanel. Messages in order: getWorkerThreads(); setScene(...); run(); render(..., this, ...); render(..., worker, remoteDisplay); render(remoteDisplay); bucket = nextBucket(); renderBucket(remoteDisplay, bucket, ...); imageUpdate(..., data, ...), three times; imageEnd().]
Module View. The module view of our architecture is described top-down,
beginning with the package overview in figure
3.14. Then a closer look is taken at the individual packages, starting with the user
interface, figure 3.15; the worker factory, figure 3.16; the request manager, figure 3.17;
Cloud ASL, figure 3.18; and ending with the worker, figure 3.19.
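[Figure 3.14: Package overview of the Turnip. Packages: UI, Request Manager, Worker Factory, Library, CloudASL, Worker.]
[Figure 3.15: User interface package (Web). Classes HttpActivator, AJAXServlet and IndexServlet; operations shown: setServlets, setRequestManager, unsetRequestManager.]
[Figure 3.16: Worker Factory package. Interface WorkerFactory with operations createWorker, registerWorker, terminateWorker.]
[Figure 3.17: RequestManager package. Interface RequestManager with operations addWorker, announceExeption, getImageUpdate, getNextBucket, getResults, getSunflowLog, getSunflowStatus, getSunflowStatusPercentage, getWorkers, makeRequest, registerWorker.]
[Figure 3.18: Cloud ASL package. Packages ASL and Cloud.]
[Figure 3.19: Worker package. Interface Worker with operations announceRequestManager, getId, setId, getWorkerId, setScene, run, stop.]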
Deployment View Figure 3.20 shows the deployment of the architectural prototype.
The deployment is based on the client-server architectural pattern. The following el-
ements are of interest:
– The Manager is the server in the client-server pattern and acts as the job
distributor for the software.
– The Worker is the client in our client-server pattern and acts as a job processor
for the software.
– The Request Manager is the server in the client-server pattern and acts as the job
distributor for the software.
– ASL. See figure 3.18. This is our ASL implementation from section 3.2.
The experiment ended with a working prototype, that got the working name “the Turnip”.
The Turnip is a distributed ray-tracing software built for cloud computing usage. It is
made to be used with Cloud ASL and utilizes it for cloud operations, mainly for creating
and destroying workers.
Setup The overall setup of the Turnip starts with two types of software deployments,
“manager” and “worker”, which behave in a client-server architecture where the manager
acts as a server and the worker acts as client. The manager is responsible for interacting
with the user, managing the workers, creating and destroying workers and distributing
work. The worker’s only responsibility is to process the work from the manager.
User Interface
The UI is a web-based stateless terminal, i.e. it does not include any concrete logic.
It uses the Request Manager for all communication and business logic. See the
User Interface paragraph below. This component exists on the manager.
Request Manager
The Request Manager is the heart of the manager. It is used by the UI and com-
municates to other components/servers for results. For example if the user wants
to add more workers, the Request Manager calls the worker factory to add or re-
move workers. If the user adds a new job, the Request Manager takes the job and
splits it into buckets and activates the worker for that job. This component exists
on the manager.
Worker Factory
The worker factory is responsible for taking care of the worker instances. It uses Cloud ASL
[Figure 3.20: Deployment view of the Turnip. Node Manager hosting User Interface, ASL, Sunflow and WorkerManager; nodes Worker 1, Worker 2, ..., Worker n, each hosting Worker and Sunflow.]
Library
The Library is one of the critical parts of Cloud ASL and the Turnip. The Library
holds all interfaces for Cloud ASL. Services are registered as interfaces through
the Library. The Library also holds any external libraries used by the system, like
Sunflow for the rendering. The Library exists on both manager and worker.
ASL
This is our ASL implementation, Cloud ASL. It takes care of devices (creating, ter-
minating, starting and stopping) and components (installing, uninstalling, start-
ing and stopping). This component exists on the manager.
Worker
The worker is responsible for processing the jobs. The worker communicates with
the Request Manager through r-OSGi and pulls new buckets to render from
there. It utilizes Sunflow from the library component to render the active bucket.
When it has finished rendering, it sends the bucket back to the Request Manager as an
array of colors.
Ray Tracing The ray-tracing software used as a basis for the Turnip is Sunflow9, which
needed some modification to be able to render on a distributed system. Helios10 was chosen
in this project as a model for Sunflow job distribution. An example of a Sunflow-rendered
image can be seen in figure 3.21.
A rendering job consists of a scene file and textures. The scene file defines objects in the
scene, such as a ball model with a texture here, a teapot model there with other texture
etc., and properties, such as camera location and direction and light sources. The texture
files are plain images which are applied to objects giving them a texture.
Rendering jobs are rendered with a bucket rendering technique. This means that each
job is split into multiple smaller jobs or buckets, and each of these buckets is a small part
of the resulting image. For example, if one decides to render an image that is 100 pixels
wide and 100 pixels tall, that image could be split into 100 smaller images of
10 by 10 pixels each. This rendering method is convenient for distribution.
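The splitting step can be illustrated with a minimal Java sketch. The class BucketSplitter and the simple row-by-row ordering are our own assumptions for illustration; Sunflow's actual bucket ordering and data structures are not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of bucket splitting; not taken from Sunflow.
final class BucketSplitter {
    // One rectangular part of the image, in pixel coordinates.
    static final class Bucket {
        final int x, y, width, height;
        Bucket(int x, int y, int width, int height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    // Splits a width x height image into buckets of at most bucketSize pixels
    // per side, row by row. Buckets on the right and bottom edges are smaller
    // when the image size is not a multiple of bucketSize.
    static List<Bucket> split(int width, int height, int bucketSize) {
        if (width <= 0 || height <= 0 || bucketSize <= 0) {
            throw new IllegalArgumentException("all sizes must be positive");
        }
        List<Bucket> buckets = new ArrayList<>();
        for (int y = 0; y < height; y += bucketSize) {
            for (int x = 0; x < width; x += bucketSize) {
                int w = Math.min(bucketSize, width - x);
                int h = Math.min(bucketSize, height - y);
                buckets.add(new Bucket(x, y, w, h));
            }
        }
        return buckets;
    }
}
```

Calling BucketSplitter.split(100, 100, 10) yields the 100 buckets of the example above. Because each bucket carries its own position, buckets can be rendered in any order and on any worker.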
Sunflow was leveraged by including its class files in the library component. This ensures
that Sunflow is always available on all workers and the manager; however, it makes
updates of Sunflow more complex than necessary. The rendering work-flow starts with the
user giving the Turnip a job to process. This means that the UI sends a message to the
worker manager, which then analyses the job, extracts information such as job complexity
and image size, and then creates image buckets for the job at hand. When this process
is over the worker manager sends a signal to the workers, through an r-OSGi connection, to
start rendering. Each worker interacts with the request manager by requesting a bucket
to render and then sends that part of the image to the request manager when it has been
rendered. The request manager arranges the image parts into a result image based on the
bucket location. The current status of the image is regularly saved so that the user can
see the current status of the job. This process can be seen in figure 3.13.
9
Sunflow is an open source rendering system for photo-realistic image synthesis. It is written
in Java and built around a flexible ray tracing core and an extensible object-oriented design.
http://sunflow.sourceforge.net/
10
Helios is a distributed rendering system based on Sunflow and made for grid systems.
http://sfgrid.geneome.net/
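The pull-based work-flow described above can be sketched in plain Java. This is a minimal sketch under our own assumptions: threads stand in for remote workers, direct method calls stand in for r-OSGi, and the "renderer" simply writes each pixel's index as its color. The names PullRenderingSketch and Manager are hypothetical; nextBucket and imageUpdate mirror the message names in figure 3.13 but not their real signatures.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class PullRenderingSketch {
    static final class Bucket {
        final int x, y, width, height;
        Bucket(int x, int y, int width, int height) {
            this.x = x; this.y = y; this.width = width; this.height = height;
        }
    }

    // Plays the role of the request manager: hands out buckets and
    // assembles the returned colors into the result image.
    static final class Manager {
        private final Queue<Bucket> pending = new ConcurrentLinkedQueue<>();
        private final int imageWidth;
        private final int[] image;

        Manager(int imageWidth, int imageHeight, int bucketSize) {
            if (imageWidth <= 0 || imageHeight <= 0 || bucketSize <= 0) {
                throw new IllegalArgumentException("all sizes must be positive");
            }
            this.imageWidth = imageWidth;
            this.image = new int[imageWidth * imageHeight];
            for (int y = 0; y < imageHeight; y += bucketSize) {
                for (int x = 0; x < imageWidth; x += bucketSize) {
                    pending.add(new Bucket(x, y,
                            Math.min(bucketSize, imageWidth - x),
                            Math.min(bucketSize, imageHeight - y)));
                }
            }
        }

        // Returns the next bucket to render, or null when no work is left.
        Bucket nextBucket() { return pending.poll(); }

        // Copies a rendered bucket into the image at the bucket's location.
        synchronized void imageUpdate(Bucket b, int[] colors) {
            for (int row = 0; row < b.height; row++) {
                System.arraycopy(colors, row * b.width,
                        image, (b.y + row) * imageWidth + b.x, b.width);
            }
        }

        synchronized int[] result() { return image.clone(); }
    }

    // Stand-in renderer: the "color" of a pixel is its index in the image.
    static int[] render(Bucket b, int imageWidth) {
        int[] colors = new int[b.width * b.height];
        for (int row = 0; row < b.height; row++) {
            for (int col = 0; col < b.width; col++) {
                colors[row * b.width + col] = (b.y + row) * imageWidth + (b.x + col);
            }
        }
        return colors;
    }

    // Starts workerCount workers that pull buckets until none are left.
    static int[] renderWithWorkers(int width, int height, int bucketSize, int workerCount) {
        final Manager manager = new Manager(width, height, bucketSize);
        Thread[] workers = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            workers[i] = new Thread(() -> {
                Bucket bucket;
                while ((bucket = manager.nextBucket()) != null) {
                    manager.imageUpdate(bucket, render(bucket, width));
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return manager.result();
    }
}
```

Because workers pull buckets rather than having them assigned up front, faster workers naturally process more buckets, and workers can be added without redistributing work.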
Cloud operations The Turnip uses Cloud ASL as its basis for cloud computing operations.
The cloud operations that the Turnip uses are starting a cloud computing
instance and terminating a cloud computing instance. This is done through the UI, where
a user starts a new rendering job or adds a new worker to the pool of running workers.
When the user adds a new worker, the UI sends a request to the worker manager, which
then runs a set of ASL operations that start a cloud instance and install the
components required for the worker. The list of operations is shown in listing 3.3. The
worker factory starts a thread that waits for the new worker to come on-line. When it is
on-line it is registered through r-OSGi as a new worker service, which in turn makes
the worker register the request manager as a service. This process can be viewed in figure
3.12.
Device device = asl.create_instance_device("m1.small");
Component c1 = asl.install_component(device, "http://bjolfur.com/turnip_library.jar");
Component c2 = asl.install_component(device, "http://bjolfur.com/remote-1.0.0.RC4.jar");
Component c3 = asl.install_component(device, "http://bjolfur.com/com.springsource.org.codehaus.janino-2.5.15.jar");
Component c4 = asl.install_component(device, "http://bjolfur.com/turnip_worker.jar");

asl.start_component(c1);
asl.start_component(c2);
asl.start_component(c3);
asl.start_component(c4);
Listing 3.3: Start worker Cloud ASL script as used by worker manager
The User Interface The User Interface (see figure 3.22) of the Turnip was designed to
be simple but functional enough to do all required work in a “web 2.0” way. That is, it is
a simple HTML page, created and registered as a Java servlet through the OSGi HTTP
admin. All functionality of the web page is provided through AJAX11 interfaces, which
include updating the status of a running job, adding workers, and viewing the log. The
status update is done through a request manager call, which takes the current status of the
rendered image and saves it over the original image. This way the user can set the interval
of updates for the status.
11
Asynchronous JavaScript and XML
4. Evaluation
After designing and implementing Cloud ASL and our prototype, the results can be
evaluated and the system can be tested by running real jobs and observing how it per-
forms. Firstly a qualitative evaluation will be carried out to measure how the architec-
ture stands. Secondly, the system will have to perform a few tasks and its performance
is measured.
In the work done for this thesis, Cloud ASL has been created. It is a system for modifying
software architecture through architectural operations. A prototype was also created to
utilize Cloud ASL. The question that can now be raised is: are these systems usable and
have they met the set requirements? In this section we answer this question
by evaluating the architecture, first by utility and completeness and secondly by quality
attributes.
Utility denotes the applied use of the system for the goal of modelling and managing run-
time architecture of cloud computing applications, i.e. does the system enable others to
make dynamic software on cloud computing infrastructure and manage it with archi-
tectural operations, and if so how difficult is it? Completeness is evaluated in the terms
of the requirements of the system. The system should have resolved the requirements
presented in 3.4.1. The Turnip was created to demonstrate and argue for the utility of
the system.
Utility Utility is determined in terms of how the system, Cloud ASL, allows other
developers to build dynamic software on clouds and manage it by architectural scripting.
The main parts of Cloud ASL have been implemented, and this process is the main
foundation for discussing utility. Cloud ASL has not been used or tested by other software
developers, which might be necessary to get a good picture of its utility. Instead, the
utility is examined from views other than direct usage by others: a developer scenario is
described, and Cloud ASL is compared to alternative ways of developing and modelling
architectural change.
The developer creates two separate Java applications, one to manage the cloud environment
and another to manage Helios/Sunflow through a web interface. This approach is similar to
the one taken in the prototype, except that the prototype used Cloud ASL to manage the cloud.
When the developer has finished the implementation and has the system running, he/she
needs to update the software, e.g. upgrade Jini to a newer version. What the developer
would do in this case is shut down all the virtual instances through the cloud managing
software, update the code on the virtual machine template and start again with fresh
instances and updated code. Another, less intrusive but more complex, way would be to
update the code on all running instances, which would still require shutting down the
running code. A better long-term solution would be to create a configuration script or
application that does this automatically, which is precisely the aim of Cloud ASL.
As this specific scenario shows, Cloud ASL can have an advantage over a solution that is
not aimed at configuration.
Completeness To determine if the architecture for this system is complete, two ques-
tions need to be answered. One, does the architecture satisfy all of the requirements
identified in section 3.4.1? Second, is it possible to build cloud computing software us-
ing this implementation of Cloud ASL?
First we examine the functional requirements for the architectural prototype
from section 3.4.1. In short these requirements are:
As these requirements were the basis of the design, they were followed closely. The
prototype was built on top of Cloud ASL and is closely integrated with it, which in turn
enables cloud connectivity by default. Because developing for Cloud ASL forces modular
development if done correctly, this requirement is accepted. A closer look at these
requirements, through the use cases from section 3.4.1, paints a better picture:
1
Jini is a network architecture for the construction of distributed systems in the form of modular co-
operating services. http://jini.org
As a user I can create ray-tracing jobs and choose on how many computing instances
they run at a time, and when a job is finished I can save the results of the ray trac-
ing job.
If we take this use case literally the requirements are met, but there are a few things
that could have been done better. Currently the job selection mechanism is hard-coded
into the program, so the end user cannot easily select or modify a scene to render. This
can be done through Cloud ASL, but a better solution would be to distribute the rendering
job through a file-based distribution mechanism, such as Amazon S3 in the case of AWS;
there does not, however, seem to be a standard distributed storage among cloud providers.
The final result of a rendering process is a picture, which is easily stored. This result
is not stored in the cloud but should have been stored with the rendering job, which would
have been simple had the distributed file-based mechanism been implemented.
As a user I can shut down running computing instances after I have finished working
with them.
This feature was not fully integrated with the user interface, although it was fully
completed and tested for Cloud ASL scripts.
As a user I want to be able to monitor the result of the currently running ray tracing
job.
This use case was completed successfully. At a user-specified interval, a picture with
the current status of the rendering job (see figure 3.22) is presented, displaying which
buckets are being processed by workers, and the log from the manager can be viewed with
a click of the mouse.
After we created the first version of the prototype a list was made of possible changes
that could or should be done that might improve the prototype, and this list became the
basis of quality attribute scenarios. These scenarios will be discussed in the following
section.
To describe and measure the quality of Cloud ASL, we made a list of possible change
scenarios early in the implementation phase, based on quality attribute scenarios for the
Turnip. These scenarios describe, for example, the change that has to be made to achieve
certain quality requirements, improved features or some other architectural change. These
scenarios were then evaluated with regard to whether Cloud ASL could support them, and
lastly it was concluded how the supported changes would be implemented through Cloud ASL
scripts.
These scenarios were evaluated, and based on the outcome Cloud ASL scripts were cre-
ated.
Unsupported scenarios:
Scenarios that cannot be supported by Cloud ASL and the prototype are:
3. Change of OSGi platform, for example from Apache Felix to Equinox, table 4.3.
The current implementation does not support it because Cloud ASL uses an Apache Felix
specific remote shell to interact with the OSGi framework and our virtual image has Apache
Felix pre-installed. Changing the OSGi platform would require modifying the virtual image,
making it too complicated for this scenario.
Unimportant scenarios:
For each scenario two factors were rated, priority and difficulty, each measured from low
to high. A scenario was considered important if its importance was more than 4, where
importance = priority + difficulty, with high priority = 3, medium = 2 and low = 1, and
high difficulty = 1, medium = 2 and low = 3. Scenarios that were both difficult and of low
priority were marked as out of the scope of the project. Scenarios that were not deemed
important enough for the project were:
Supported Scenarios: The rest of the scenarios are supported by Cloud ASL. They are:
9. Migrate from Amazon EC2 public cloud to Eucalyptus private cloud, table 4.15.
For each of these scenarios, the original quality attribute scenario and the Cloud ASL
script supporting it will be listed and the scenario discussed.
Device device = asl.create_instance_device("m1.small");
Component c1 = asl.install_component(device, "http://bjolfur.com/turnip_library.jar");
Component c2 = asl.install_component(device, "http://bjolfur.com/remote-1.0.0.RC4.jar");
Component c3 = asl.install_component(device, "http://bjolfur.com/com.springsource.org.codehaus.janino-2.5.15.jar");
Component c4 = asl.install_component(device, "http://bjolfur.com/turnip_worker.jar");

asl.start_component(c1);
asl.start_component(c2);
asl.start_component(c3);
asl.start_component(c4);
Shutdown of instances:
Although this scenario was tested and implemented in the project, it did not make it
into the final version of the user interface. In table 4.8 our quality attribute scenario is
displayed and in listing 4.2 we destroy a given device.
// pre: device is running
asl.destroy_device(device);
Turnip upgrade/update:
Here we might update a single component or upgrade the whole project. For a single
component a simple component update should be enough. But for a whole project more
drastic measures are needed. For our Cloud ASL script it is assumed that we are upgrad-
ing the whole software and will therefore shut down all instances and set everything up
from scratch. Here it is assumed that Cloud ASL is running from an external location,
i.e. not from the manager, and OSGi and basic required components are available on
the created cloud instance. In table 4.9 our quality attribute scenario is displayed and in
listing 4.3 we start by destroying all our devices and then create one instance and install
and start required components to run the manager.
// first we terminate all instances.
Device[] devices = asl.getDevices()
for (device in devices)
    asl.destroy_device(device)

// then we create a manager
Device manager = asl.create_instance_device("m1.small")
// and install required components
// Library
Component c1 = asl.install_component(manager, "http://URI/library.jar");
// Cloud ASL
Component c2 = asl.install_component(manager, "http://URI/asl.jar");
// r-OSGi
Component c3 = asl.install_component(manager, "http://URI/rosgi.jar");
// User interface
Component c4 = asl.install_component(manager, "http://URI/UI.jar");
// Request Manager
Component c5 = asl.install_component(manager, "http://URI/req_manager.jar");
// Worker Factory
Component c6 = asl.install_component(manager, "http://URI/w_factory.jar");

// And then we start the components
asl.start_component(c1);
asl.start_component(c2);
asl.start_component(c3);
asl.start_component(c4);
asl.start_component(c5);
asl.start_component(c6);

// As the manager is capable of starting workers
// there is no need to do that manually
// pre: Component ui exists on Device manager
asl.stop_component(ui);
asl.update_component(ui, "http://URI/updatedUI.jar");
asl.start_component(ui);

// pre: Component ui exists on Device manager
Component rest = asl.install_component(device, "http://URI/newRESTService.jar");
asl.start_component(rest);
// now the new user interface can communicate via REST services
2
http://boinc.berkeley.edu/: The Berkeley Open Infrastructure for Network Computing (BOINC) is a
non-commercial middle-ware system for volunteer and grid computing
// to change the service the library
// and the request manager need to be switched.
// pre: Component rManager runs request manager and Component lib runs Library
asl.stop_component(lib);
asl.stop_component(rManager);
asl.update_component(lib, "http://URI/boinc_lib.jar");
asl.update_component(rManager, "http://URI/boinc_request_manager");
asl.start_component(lib)
asl.start_component(rManager)
Updating r-OSGi:
Because r-OSGi runs on all devices and connects all workers to the manager and it can
not be assumed that the new version reconnects lost connections correctly, all workers
will be terminated and the manager allowed to be in charge of recreating all workers.
In table 4.13 our quality attribute scenario is displayed, and in listing 4.7 we start by
terminating all of our workers and then stop, update and start r-OSGi again.
// first we terminate all instances.
Device[] devices = asl.getDevices()
for (device in devices)
    asl.destroy_device(device)

// And then we start the components
asl.start_component(c1);
asl.start_component(c2);
asl.start_component(c3);
asl.start_component(c4);
asl.start_component(c5);
asl.start_component(c6);

// Because the manager is capable of starting workers
// there is no need to do that manually
4.2.1. Performance
The performance of a system can become a critical factor if the system is to be useful in
real-world usage. The performance of Cloud ASL was measured in comparison to a script
performing the same architectural change manually. Two scripts with the same functionality
were created, one Cloud ASL script and one manual script. The manual script used the same
third-party components as Cloud ASL to minimize the uncertainty introduced by different
libraries. These scripts began by creating device instances and then performed a few
component operations. These operations can be seen in the ASL script in listing 4.11:
Device device = asl.create_device("m1.small")
Component com = asl.install_component(device, "http://bjolfur.com/turnip_library.jar")
asl.start_component(com)
asl.stop_component(com)
asl.update_component(com, "http://bjolfur.com/turnip_library.jar")
asl.start_component(com)
The Groovy script was designed to do exactly the same, with the same libraries; see
appendix A. In short, both scripts do the following: they start by creating an Amazon EC2
instance through the Amazon AWS SDK and wait for the instance to run; once it is running
they use SSH to access the machine, set configuration switches and run the OSGi platform.
Next they use telnet to interact with the OSGi platform and install, update, start and stop
bundles/components. The time it took the scripts to start instances was measured, as well
as the time it took to perform operations on the OSGi platform (configuring components).
The results are listed in table 4.17. In figure 4.1 a scatter chart displays the startup
times of Cloud ASL scripts versus the manual startup times, and in figure 4.2 a scatter
chart displays the operation timing of Cloud ASL scripts versus the manual script times.
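The "wait for the instance to run" step and the startup timing can be illustrated with a small polling loop. This is only a sketch under stated assumptions: the class `InstancePoller`, the method `waitForState` and the fake state source are hypothetical names introduced here, and the real scripts query the Amazon API instead of a local lambda.

```java
import java.util.function.Supplier;

public class InstancePoller {

    /**
     * Polls stateSource until it reports targetState or maxAttempts is reached.
     * Returns the number of polls performed, or -1 if the target state was never
     * seen or the thread was interrupted while waiting.
     */
    public static int waitForState(Supplier<String> stateSource, String targetState,
                                   int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (targetState.equalsIgnoreCase(stateSource.get())) {
                return attempt;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a cloud API call:
        // reports "pending" twice and "running" afterwards.
        int[] calls = {0};
        Supplier<String> fakeInstance = () -> ++calls[0] < 3 ? "pending" : "running";

        long start = System.nanoTime();
        int polls = waitForState(fakeInstance, "running", 10, 10);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;

        System.out.println("polls=" + polls + ", startup took " + elapsedMillis + " ms");
    }
}
```

Bounding the number of attempts keeps a failed instance start from blocking the measurement run indefinitely.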
The scripts were run on many instances, and the number of instances was increased for
each iteration, with the average number of instances equalling 10. The time of starting
each instance was measured, as well as the time of running the Cloud ASL/manual operations
on these instances. Detailed timing measurements can be found in appendix B. Basic
statistical analysis was done on the measurements (standard deviation, average, minimum
and maximum data points), and the results are listed in table 4.17. From these results it
can be observed that there is a penalty of about one second for using the Cloud ASL script
for creating devices and a penalty of about one second in component operations, but
because the standard deviation is higher than this difference it can be assumed that the
time difference is not significant. It can be argued that, as the system is more complex
and has more functional properties, it should perform worse than a manual architectural
change script. With basic code refactoring and optimization the current difference could
be lowered. The error in these numbers is quite high, indicating a high uncertainty factor
in the external environment. Because the cloud infrastructure does not have uniform
capability and timing, doing uniform tests would become very expensive and time-consuming.
Uniform testing would require a controlled private cloud or a much larger sample from a
public cloud, as the standard deviation and standard error for creating devices would have
to be determined with greater certainty.
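The informal significance argument above, comparing the difference of the averages with the standard deviation, can be sketched as follows. The class `TimingStats` and the sample values in `main` are hypothetical and are not the measurements from appendix B; the criterion is the rule of thumb used in the text, not a formal statistical test.

```java
public class TimingStats {

    /** Arithmetic mean of the samples. */
    public static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    public static double sampleStdDev(double[] xs) {
        double m = mean(xs);
        double squared = 0.0;
        for (double x : xs) {
            squared += (x - m) * (x - m);
        }
        return Math.sqrt(squared / (xs.length - 1));
    }

    /**
     * Rule of thumb from the text: the difference between two averages is treated
     * as not significant when it is no larger than the larger standard deviation.
     */
    public static boolean differenceWithinNoise(double[] a, double[] b) {
        double difference = Math.abs(mean(a) - mean(b));
        return difference <= Math.max(sampleStdDev(a), sampleStdDev(b));
    }

    public static void main(String[] args) {
        // Hypothetical timings in seconds, for illustration only.
        double[] aslScript = {61.0, 58.5, 66.2, 63.1, 59.8};
        double[] manualScript = {60.2, 57.9, 64.0, 62.5, 58.7};

        System.out.printf("ASL:    mean %.2f s, std dev %.2f s%n",
                mean(aslScript), sampleStdDev(aslScript));
        System.out.printf("manual: mean %.2f s, std dev %.2f s%n",
                mean(manualScript), sampleStdDev(manualScript));
        System.out.println("difference within noise: "
                + differenceWithinNoise(aslScript, manualScript));
    }
}
```

A larger sample, as suggested in the text, would allow a proper test such as a two-sample t-test to replace this rule of thumb.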
Figure 4.1.: A scatter chart of startup timing of Cloud ASL script vs. manual script
4.2.2. Scalability
After comparing Cloud ASL to performing manual architectural change through Groovy and
OSGi, it can be assumed that ASL performance is efficient, but it is also useful to know
how it scales. Because Cloud ASL is created to run and operate on multiple devices,
operating on many devices simultaneously is the next subject of evaluation. It was measured
by running ASL scripts on 1, 2, 4, 8, 16 and 32 instances at once. First a script was run
that created a device and did a few component operations; then a script that did a few
component operations on an already running device was run four times, to measure any
statistical difference between devices.
The first script used was a script which creates a device and does a few basic operations
on components:
Component com = asl.install_component(device, "http://bjolfur.com/turnip_library.jar");
asl.start_component(com);
asl.stop_component(com);
asl.stop_component(com);
asl.update_component(com, "http://bjolfur.com/turnip_library.jar");
asl.start_component(com);
Figure 4.2.: A scatter chart of operations timing of Cloud ASL script vs. manual script
The first script is the same as in the performance test, but this time the difference
between runs on different numbers of instances was measured.
The results are shown in table 4.18, and the full raw data of the tests is provided in
appendix C. Figure 4.3 displays a bar chart of averaged worker efficiency, i.e. average
runtime, minimum runtime, maximum runtime and standard deviation averaged over the number
of workers running the set job, and figure 4.4 displays an error chart of the same data.
Number of devices created                  1        2        4        8        16       32
#Devices successfully running              1        2        4        8        13       19
Number of Cloud ASL processes              4        8        16       32       52       76
Average timing of each process (mm:ss.s)   00:05.7  00:05.4  00:05.9  00:06.0  00:06.1  00:05.9
Min                                        00:05.3  00:04.4  00:04.5  00:04.3  00:05.0  00:04.9
Max                                        00:05.9  00:06.2  00:07.6  00:08.2  00:09.4  00:07.8
Standard deviation                         00:00.2  00:00.7  00:00.8  00:00.8  00:00.7  00:00.7
Figure 4.3.: Bar chart of ASL performance, rendering efficiency with different amount
of workers
Figure 4.4.: Error chart of ASL performance, rendering efficiency with different amount
of workers
These measurements showed that Amazon has a limit of 20 running instances at a time,
although up to 24 instances were running at one point (because we started our instances
all at once, the Amazon API seems to have been unable to detect the exact number of
running instances at a given time). This limited the 32 device run, and trying to start
this many instances at once, 16 or 32, usually resulted in failures in starting some of
the devices. As can be seen in table 4.18, there is not much penalty for running Cloud ASL
on multiple devices; however, the variance of the runtime increases as the number of
devices increases.
The prototype was also tested by rendering an image on different numbers of devices. A
rendering job was set up, which ran three times on each of 3, 5, 8 and 10 workers at the
same time, where each worker had an independent device for itself. The job was to render
an image of three aliens, with a resolution of 648x480, anti-aliasing (4 samples), Gaussian
filter, etc. The results of this rendering can be seen in table 4.19. An example of
rendering a job with three workers can be viewed in figure 4.5, and with ten workers in
figure 4.6. In the rendering figures each coloured box represents a different worker, and
the area under the box is the bucket the worker is currently rendering.
Number of workers   Avg. render time (mm:ss.s)   Avg. render time per worker (mm:ss.s)
3                   06:44.3                      20:12.9
5                   04:08.8                      20:44.2
8                   02:34.0                      20:31.7
10                  02:04.2                      20:41.7
The results from the prototype rendering are quite promising. It scales well: the time it
takes to complete a job is almost inversely proportional to the number of workers, so the
speedup is close to linear. From the logs of the Amazon EC2 cloud it could be seen that in
all cases the workers' CPUs were fully utilized, while the manager was running at about
50% utilization on average. Serving the user interface was what had the most impact on the
manager, as for each user the UI serves the current state of the rendering job according
to his/her preferences. This behavior does not scale effectively but could be fixed with
simple optimization.
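The near-linear scaling can be checked directly from the rendering table: multiplying the wall-clock render time by the number of workers gives an almost constant amount of total worker time. The following sketch does that arithmetic on the table values; the class `RenderScaling` and its method names are hypothetical, and the efficiency is computed relative to the three-worker run since no single-worker measurement is reported.

```java
public class RenderScaling {

    /** Converts a "mm:ss.s" string such as "06:44.3" into seconds. */
    public static double parseSeconds(String mmss) {
        String[] parts = mmss.split(":");
        return Integer.parseInt(parts[0]) * 60 + Double.parseDouble(parts[1]);
    }

    /** Total compute time spent by all workers for one job. */
    public static double workerSeconds(int workers, double wallSeconds) {
        return workers * wallSeconds;
    }

    /** Efficiency of a run relative to a baseline run (1.0 means perfectly linear). */
    public static double relativeEfficiency(int baseWorkers, double baseSeconds,
                                            int workers, double seconds) {
        return workerSeconds(baseWorkers, baseSeconds) / workerSeconds(workers, seconds);
    }

    public static void main(String[] args) {
        // Worker counts and average render times taken from the rendering table.
        int[] workers = {3, 5, 8, 10};
        String[] renderTimes = {"06:44.3", "04:08.8", "02:34.0", "02:04.2"};

        double baseSeconds = parseSeconds(renderTimes[0]);
        for (int i = 0; i < workers.length; i++) {
            double seconds = parseSeconds(renderTimes[i]);
            System.out.printf("%2d workers: %.1f s wall, %.1f worker-seconds, efficiency %.3f%n",
                    workers[i], seconds, workerSeconds(workers[i], seconds),
                    relativeEfficiency(workers[0], baseSeconds, workers[i], seconds));
        }
    }
}
```

The worker-seconds computed this way agree with the last column of the table to within rounding, which suggests that this column reports total worker time per job.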
Although Cloud ASL scripting is slower than the measured alternative, a manual
architectural change script, writing a Cloud ASL script takes less development time than
writing a manual script. In the example from section 4.2.1, a 5 line ASL script was
created, whereas the alternative manual script consisted of more than 300 lines of code.
While developing Cloud ASL, no performance optimizations were made, and it is predicted
that with simple optimizations Cloud ASL would perform as well as manual scripts.
All the tests were made on Amazon EC2 “m1.small” type of instances on customized 32
bit Ubuntu 9.04 Linux. As all the measurements were done on complete rendering work
of all workers combined, there is no detailed information on the efficiency of individual
workers. In future work this could be an interesting part of analysing the difference
of efficiency between different types of cloud providers, cloud instance types and cloud
instances.
5. Discussions and Conclusions
In this section the results of this thesis will be discussed and summarized, what went
wrong, what might have been done better and future work on Cloud ASL and the Turnip.
In this thesis, Cloud ASL has been presented. It is an architectural scripting language
focused on infrastructure as a service (IaaS) cloud computing services. This tool, or
framework, enables controlling dynamic aspects of runtime software architecture with
architectural operations in cloud computing. This is a suitable framework to use for cre-
ating a scalable and modifiable cloud computing software. On top of this framework,
a prototype was created to evaluate the features of Cloud ASL for dynamic scaling and
runtime architectural changes. Although Cloud ASL is not a feature complete system,
and it might be difficult for inexperienced users to understand, it has a good educational
perspective by thinking of architectural changes with an architectural framework. Using
architectural scripting can help and simplify updates on software systems and software
architecture. When software developers are making software systems nowadays,
modifiability and scalability tend to get lower priority than they should. This is because
of costs, lack of time, and the fact that the tools used do not require thought about
future modifications and scalability. Cloud ASL could help here, as it requires a different
kind of thinking: it forces developers to create modular software where each module is
independent, in the sense that it can maintain its state while other modules are not
present, and, as devices are a part of the framework, it allows developers to design for
scalability from the start, making modifiability and scalability the core of the framework.
For this thesis, RHI (Reiknistofnun Háskóla Íslands) promised access to a part of a com-
puter cluster, a few nodes to start with. Full root access was not granted to these ma-
chines. Instead an administrator was relied on to set them up and install required soft-
ware for and with us, but due to how busy this administrator was and due to the complex
network and software setup it was not possible to finish setting up Eucalyptus. Having
a private cloud could have eased the first phase of the development given a steady access
to a cloud environment, but it could also have slowed down the work in later phases as
these solutions are not as complete and stable as the public cloud used in this project.
Since the main work on this thesis was completed, NASA (US Space Agency), in coop-
eration with the IaaS service provider Rackspace, introduced a new private IaaS service
software, OpenStack1 . This IaaS software is simpler than Eucalyptus and might have
1
OpenStack is a robust EC2 compatible IaaS software layer which includes an object store compatible
gotten a private cloud running if that offer had been available when this project started.
As the private cloud did not work, a few public cloud providers were signed up for, and
in the end Amazon AWS was chosen for this project as it seemed to be the best supported
by other software libraries. Later, financial support was provided by Amazon to use their
service for the project. A virtual machine image was created with the software bundles
needed and stored in Amazon’s Simple Storage Service. This was a complex and very
time-consuming task. Later, we found out that a newer service from Amazon, Elastic Block
Storage, could have been used to store the virtual images, which would have provided more
flexibility. For example, it would have been possible to start and stop instances and to
create a new virtual machine image from an existing image.
This would have given Cloud ASL more modifiability and eased development and test-
ing. Another Amazon AWS service that would have helped is Amazon Virtual Private
Cloud: with that service a virtual network could have been created and all network pro-
tocols enabled within that network. R-OSGi relies on multicast over SLP2 to be able to
automatically discover OSGi services over the network. Because multicast is disabled
on Amazon’s regular network, it was necessary to create the service connection manu-
ally, which caused more overhead and complexity. Although these services would have
helped in many ways, it would also force us to rely more on Amazon as a service provider
and make switching service providers a more difficult task.
The cloud computing integration into Cloud ASL began by using a cloud computing
abstraction framework, JCloud, which exposes interfaces to operate on multiple cloud
frameworks. This framework was not complete enough and did not expose common cloud
computing interactions well enough, so to use common cloud operations, such as starting a
virtual instance, it was necessary to bypass the framework and interact directly with the
Amazon API. A final version of the framework was not foreseeable in the near future, so
the Amazon API was used directly instead. This problem raises a question for Cloud ASL:
Can Cloud ASL be used to support multiple IaaS providers?
Integrating more cloud computing providers into Cloud ASL would improve modifiabil-
ity, and that could be done by extracting cloud functionality into a Cloud ASL interface
and cloud operations into a Cloud ASL component and exposing specific cloud oper-
ations as a Cloud ASL service through the common cloud interface. This will limit the
functionality of Cloud ASL as cloud providers do not support the same functionality, and
therefore the cloud interface would only include common cloud provider operations.
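A common cloud interface of this kind could look roughly like the sketch below. It is not part of the Cloud ASL implementation: the names `CloudProvider`, `InMemoryProvider` and `destroyAll` are hypothetical, and the in-memory provider only stands in for a real adapter (for example one wrapping the Amazon API) so that the example can run without a cloud account.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CloudProviderSketch {

    /** Hypothetical common interface: only operations every provider can offer. */
    public interface CloudProvider {
        String createDevice(String instanceType);

        void destroyDevice(String deviceId);

        List<String> listDevices();
    }

    /** In-memory stand-in for a provider-specific adapter. */
    public static class InMemoryProvider implements CloudProvider {
        private final Map<String, String> devices = new LinkedHashMap<>();
        private int nextId = 1;

        @Override
        public String createDevice(String instanceType) {
            String id = "dev-" + nextId++;
            devices.put(id, instanceType);
            return id;
        }

        @Override
        public void destroyDevice(String deviceId) {
            devices.remove(deviceId);
        }

        @Override
        public List<String> listDevices() {
            // Return a copy so callers can iterate while destroying devices.
            return new ArrayList<>(devices.keySet());
        }
    }

    /** A provider-neutral operation written only against the common interface. */
    public static void destroyAll(CloudProvider provider) {
        for (String id : provider.listDevices()) {
            provider.destroyDevice(id);
        }
    }

    public static void main(String[] args) {
        CloudProvider provider = new InMemoryProvider();
        provider.createDevice("m1.small");
        provider.createDevice("m1.small");
        System.out.println("running devices: " + provider.listDevices());

        destroyAll(provider);
        System.out.println("after destroyAll: " + provider.listDevices());
    }
}
```

Scripts written against such an interface would not change when a provider is swapped, at the cost of losing provider-specific operations, which is the trade-off described above.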
virtual machine via SSH and monitor the components by running OSGi and viewing logs
and outputs. This is a time consuming and hard process, and becomes more complex
with a larger cloud. A future feature for Cloud ASL would be added testability, by remote
logging, remote debugging on multiple instances and/or distributed unit tests.
After this thesis was completed, Amazon EC2 introduced a new type of cloud computing
instance, the cluster GPU instance, which contains powerful GPUs in addition to multiple
CPUs. An interesting approach would be to optimize the Turnip to use these GPUs for
rendering, measure any improvements and compare the efficiency to other instances. This
could also be the basis of some sort of benchmark: rendering time compared to cost of
instance.
For the implementation of the Turnip done in this project there was room for improve-
ment. Firstly, all libraries and interfaces were put in a single module, the library, which
existed on all devices. This simplified deployment, as only a single module had to be
deployed for all libraries and interfaces, but it made the library an oversized and
complicated module that is difficult to maintain, which could make updates to any library
or interface a relatively big task. A better solution would have been to keep all libraries
in separate modules and interfaces in one or more modules. Due to limited time, UI and
auto-scaling features were not implemented, such as automatic startup and shutdown of
devices, storing resulting images and uploading and customizing rendering jobs. These
features were not important for the Cloud ASL implementation but would have made
the Turnip more complete.
Making a ray-tracing program prototype for Cloud ASL took too much effort and time,
as the complexity of this prototype was too great, and this time should have been spent
focusing on the ASL implementation. The architecture of the prototype did therefore
not become as good as it should have been. Features were missing, and stability and
testability might have been better. An evolutionary architectural prototype would have
been useful in this situation, that is, taking the well-working parts of the prototype and
iterating on the prototype a few times.
We have demonstrated with our work that Cloud ASL can be used as a high level ar-
chitectural framework for cloud computing and is a valid architectural language. With
some modifications and improvements Cloud ASL can become a basis for a Platform as
a Service with Java and OSGi as the platform.
At the end, a question arises. As our Cloud ASL implementation is based on Java and
does not support other programming languages, would a programming language neutral
implementation be a feasible and/or possible advance of Cloud ASL?
Bibliography
[8] M. Ingstrup and W. Zhang, “D4.8 self-* properties ddk prototype and report,” tech.
rep., Hydra, 2008.
[9] J. Kim and D. Garlan, “Analyzing architectural styles with alloy,” in Proceedings of
the ISSTA 2006 workshop on Role of software architecture for testing and analysis,
p. 80, ACM, 2006.
[11] D. Parkhill, The challenge of the computer utility. Addison-Wesley Reading, MA,
1966.
101
102
[12] L. Qian, Z. Luo, Y. Du, and L. Guo, “Cloud Computing: An Overview,” Cloud
Computing, pp. 626–631, 2009.
[14] D. Robert, “Jeff Bozes’ risky bet,” Business Week, vol. 4009, p. 53, 2006.
[16] P. Mell and T. Grance, “The NIST Definition of Cloud Computing. National Insti-
tute of Standards and Technology,” Information Technology Laboratory, Version,
vol. 15, pp. 10–7, 2009.
[17] C. Boulton, “Oracle CEO Larry Ellison Spits on Cloud Computing Hype,”
eWeek.com, September, vol. 29, pp. 11–14, 2009.
[21] N. Medvidovic and R. Taylor, “A classification and comparison framework for soft-
ware architecture description languages,” IEEE Transactions on software engineer-
ing, vol. 26, no. 1, pp. 70–93, 2000.
[23] P. Clements, D. Garlan, L. Bass, J. Stafford, R. Nord, J. Ivers, and R. Little, Doc-
umenting software architectures: views and beyond. Pearson Education, 2002.
[24] R. Kazman, M. Klein, and P. Clements, “ATAM: Method for Architecture Eval-
uation,” Tech. Rep. CMU/SEI-2000-TR-004, Software Engineering Institute,
2000.
103
[25] N. Rozanski and E. Woods, Software Systems Architecture: Working With Stake-
holders Using Viewpoints and Perspectives. Addison-Wesley Professional, 2005.
class test {
    public def testId

    public static void main(String[] args) {

        for (i in 1..1) {
            println i;
            def newTest = new test();
            newTest.testId = i;
            def th = Thread.start {
                newTest.doTest()
            }
        }

    }

    public def doTest() {

        // variables
        String accessKeyId = "access key here";
        String secretKey = "secret key here";
        String ami = "ami-4552bb2c"; // linux server with Java + OSGi
        String keyPrefix = "g-test-";
        String availabilityZone = "us-east-1a";
        String type = "m1.small"
        def id = null
        def state = "pending"

        private AmazonEC2 ec2;
        Instance instance = null
        // set to zone
        Placement placement = new Placement();
        placement.setAvailabilityZone(availabilityZone);
        request.setPlacement(placement);

        // Set the image ID to a custom generated AMI, which include Java + OSGi
        request.setImageId(ami);

        // Create key pair for user..
        CreateKeyPairRequest kpReq = new CreateKeyPairRequest();

        String newKeyPairName = keyPrefix + new Random().nextInt();
        kpReq.setKeyName(newKeyPairName);
        CreateKeyPairResult kpres = ec2.createKeyPair(kpReq);
        KeyPair keyPair = kpres.getKeyPair();
        // [...]
        for (Reservation reservation : reservations) {
            instances.addAll(reservation.getInstances());
            if (reservation.getReservationId().equals(ReservationId)) {
                id = reservation.getInstances().get(0).getInstanceId();
                state = reservation.getInstances().get(0).getState().getName();
                System.out.println("instance found!");
            }
        }

        while (!state.equalsIgnoreCase("running")) {
            println "state: " + state
            println "id: " + id
            def describeInstancesRequest = new DescribeInstancesRequest();
            Collection<String> instanceIds = new ArrayList<String>();
            instanceIds.add(id);
            describeInstancesRequest.setInstanceIds(instanceIds);

            describeInstancesResult = ec2.describeInstances(describeInstancesRequest);
            reservations = describeInstancesResult.getReservations();

            for (Reservation reservation : reservations) {
                for (Instance j : reservation.getInstances()) {
                    if (j.getInstanceId().equals(id)) {
                        state = j.getState().getName();
                        instance = j;
                    }
                }
            }
        }

        System.gc();
        instanceTime = System.currentTimeMillis();

        // instance is running
        // [...]
        try {
            telnet.connect(instance.getPublicIpAddress(), 6666);
            System.out.printf("telnet.connect(%s, 6666);", instance.getPublicIpAddress());
        // [...]
        System.gc();
        endTime = System.currentTimeMillis();

        System.out.println(testId + " startTime = " + startTime);
        System.out.println(testId + " instanceTime = " + instanceTime);
        System.out.println(testId + " preTelnetTime = " + preTelnetTime);
        System.out.println(testId + " endTime = " + endTime);
        System.out.println(testId + " instance startup took: " + (instanceTime - startTime));
        System.out.println(testId + " ssh startup took: " + (preTelnetTime - startTime));
        System.out.println(testId + " whole startup took: " + (endTime - startTime));
    }

    private static String getOnlyNumerals(String str) {

        if (str == null) {
            return null;
        }

        StringBuffer strBuff = new StringBuffer();
        char c;

        for (int i = 0; i < str.length(); i++) {
            c = str.charAt(i);

            if (Character.isDigit(c)) {
                strBuff.append(c);
            }
        }
        return strBuff.toString();
    }
}
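
The polling loop in the listing above queries the instance state repeatedly until it reads "running" and then records the elapsed time. As shown, the loop has no pause between queries and no upper bound on the waiting time. The following is a minimal sketch, in plain Java, of how the same measurement could be bounded. It is not the code used for the measurements in this thesis. The names StartupPoller, waitUntilRunning, pollIntervalMillis and timeoutMillis are hypothetical, and the state source is a stand-in for the EC2 describeInstances call, so the sketch runs without the AWS SDK.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical helper, not part of the thesis code: polls a state source
// until it reports "running" or a timeout elapses.
final class StartupPoller {

    /**
     * Returns the elapsed time in milliseconds once stateSource reports
     * "running" (case-insensitive), or -1 if the timeout elapses or the
     * thread is interrupted first.
     */
    static long waitUntilRunning(Supplier<String> stateSource,
                                 long pollIntervalMillis,
                                 long timeoutMillis) {
        long start = System.currentTimeMillis();
        while (true) {
            String state = stateSource.get();
            if ("running".equalsIgnoreCase(state)) {
                return System.currentTimeMillis() - start;
            }
            if (System.currentTimeMillis() - start >= timeoutMillis) {
                return -1; // give up instead of looping forever
            }
            try {
                Thread.sleep(pollIntervalMillis); // pause between queries
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
    }

    public static void main(String[] args) {
        // Assumption: a canned sequence of states replaces the real EC2 query.
        Iterator<String> states =
                Arrays.asList("pending", "pending", "running").iterator();
        long elapsed = waitUntilRunning(states::next, 10, 1000);
        System.out.println("instance startup took: " + elapsed + " ms");
    }
}
```

In a real test run the supplier would wrap the state lookup from the listing, and a return value of -1 would correspond to an instance that failed to start within the allowed time.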
1   2   3         4        5        6        7        8        9        10
1   1   01:26.2   00:06.7  00:05.9  00:05.8  00:05.3  00:05.9  00:05.7  01:32.9
2   1   01:25.3   00:07.4  00:05.6  00:04.4  00:04.4  00:04.9  00:04.8  01:32.6
    2   01:25.1   00:10.1  00:06.0  00:05.8  00:06.0  00:06.2  00:06.0  01:35.1
4   1   01:25.1   00:07.3  00:07.4  00:05.9  00:06.0  00:06.2  00:06.4  01:32.4
    2   01:25.0   00:07.3  00:06.1  00:05.6  00:05.2  00:05.9  00:05.7  01:32.3
    3   02:29.8   00:07.3  00:07.6  00:05.9  00:06.0  00:06.2  00:06.4  02:37.1
    4   01:12.7   00:07.1  00:05.4  00:04.5  00:04.6  00:04.9  00:04.9  01:19.9
8   1   01:39.2   00:07.2  00:06.2  00:05.9  00:06.1  00:06.0  00:06.0  01:46.4
    2   01:37.5   00:07.8  00:06.2  00:06.0  00:05.9  00:06.2  00:06.1  01:45.3
    3   01:32.5   00:07.4  00:05.9  00:06.0  00:05.9  00:05.9  00:05.9  01:39.9
    4   01:26.6   00:07.6  00:06.1  00:05.8  00:05.3  00:05.9  00:05.8  01:34.2
    5   01:40.0   00:07.1  00:06.1  00:05.9  00:05.9  00:05.8  00:05.9  01:47.1
    6   01:21.0   00:09.2  00:06.2  00:06.1  00:06.1  00:06.0  00:06.1  01:30.2
    7   01:41.0   00:07.2  00:06.2  00:05.8  00:05.9  00:05.2  00:05.8  01:48.2
    8   -02:07.4  Instance failed to start
8   1   01:23.9   00:08.1  00:05.4  00:04.5  00:04.3  00:05.0  00:04.8  01:32.0
    2   01:43.5   00:07.4  00:06.2  00:05.9  00:06.2  00:06.2  00:06.1  01:50.9
    3   01:23.8   00:07.1  00:08.2  00:05.8  00:05.3  00:06.0  00:06.3  01:30.9
    4   01:27.7   00:07.1  00:06.7  00:05.9  00:06.0  00:07.1  00:06.4  01:34.8
    5   01:24.2   00:08.3  00:07.4  00:06.0  00:05.9  00:05.8  00:06.3  01:32.4
    6   01:35.7   00:08.0  00:06.0  00:05.9  00:05.8  00:05.1  00:05.7  01:43.6