Cloud Computing For Data Science
In short, cloud computing means we store, manage & process data on
remote servers instead of on a local machine.
1. Service Providers :-
1. Google Cloud
2. AWS (Amazon Web Services)
3. Microsoft Azure
4. IBM Cloud
5. Alibaba Cloud, etc.
2. Types of Cloud :-
1. Public :- Accessible to all.
2. Private :- Servers accessible only within an organization.
3. Hybrid :- Combines public + private cloud computing features.
4. Community :- Services accessible by a group of organizations.
1. Distributed System :-
1. It is a composition of multiple independent systems, but all of them
are depicted as a single entity to the users.
2. The purpose of a distributed system is to share resources and to use
them effectively and efficiently.
3. Distributed systems possess characteristics such as scalability,
concurrency, continuous availability, heterogeneity and
independence in failure.
4. But the main problem with this system was that all systems were
required to be present in the same geographical location.
5. Thus, to solve this problem, distributed computing led to three more
types of computing, and they are :
Mainframe computing :-
It came into existence in 1951. Mainframes are highly powerful
and reliable computer machines.
They are responsible for handling large data, such as massive
input/output operations.
These systems have almost no downtime and high
fault tolerance.
They are very expensive.
To reduce the cost, cluster computing and grid
computing came as alternatives to mainframe
technology.
Cluster Computing :-
Nodes must be homogeneous: they should have the same
type of hardware and operating system.
Computers are located close to each other.
Computers are connected by a high-speed local area
network bus.
Computers are connected in a centralized network
topology.
Scheduling is controlled by a central server.
The whole system has a centralized resource manager.
The whole system functions as a single system.
It's used in WebLogic application servers, databases,
etc.
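The centralized scheduling point above can be sketched in Python — a hypothetical central server assigning jobs round-robin to identical (homogeneous) nodes; all names here are illustrative, not a real cluster API:

```python
from collections import defaultdict

def central_scheduler(tasks, nodes):
    """Central server assigns tasks round-robin to homogeneous nodes."""
    assignment = defaultdict(list)
    for i, task in enumerate(tasks):
        # Every node is interchangeable, so a simple rotation suffices.
        assignment[nodes[i % len(nodes)]].append(task)
    return dict(assignment)

nodes = ["node-1", "node-2", "node-3"]   # identical cluster nodes
jobs = [f"job-{i}" for i in range(7)]
print(central_scheduler(jobs, nodes))
# {'node-1': ['job-0', 'job-3', 'job-6'], 'node-2': ['job-1', 'job-4'], 'node-3': ['job-2', 'job-5']}
```

Round-robin works here precisely because the nodes are homogeneous — any job can run anywhere, so the scheduler never has to match jobs to hardware.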
Grid Computing :-
Nodes may have different operating systems and
hardware; machines can be homogeneous or
heterogeneous.
Computers may be located at a huge distance from
one another.
Computers are connected using a low-speed bus or
the internet.
Computers are connected in a distributed or
decentralized network topology.
It has many servers, but mostly each node
behaves independently.
Every node is autonomous, and anyone can opt out
anytime.
It's used in predictive modelling, automation,
simulations, etc.
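The "single entity" idea from the distributed-system points above can be sketched as a small facade — the user calls one object while the work is quietly shared among several independent workers. The class name and worker count are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

class SingleEntityFacade:
    """Many independent workers, depicted to the user as one system."""

    def __init__(self, n_workers=4):
        self.pool = ThreadPoolExecutor(max_workers=n_workers)

    def compute(self, values):
        # The work is shared among the workers transparently;
        # the caller never sees which worker handled which value.
        return list(self.pool.map(lambda v: v * v, values))

system = SingleEntityFacade()
print(system.compute([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

Threads stand in for the independent machines here; in a real distributed system the same facade would hide network calls to remote nodes.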
2. Virtualization :-
It came into existence about 40 years back, and it has become a core
technique used in IT firms.
It employs a software layer over the hardware and, using this, provides
the customer with cloud-based services.
3. WEB 2.0 :-
This computing lets users generate their own content and collaborate
with other people or share information using social media.
Web 2.0 is the combination of the second-generation
"World Wide Web (WWW)" technology with web services, and it is the
computing type that is used today.
4. Service Orientation :-
Example :- Google Docs, where Google acts as the
Service Provider and the user acts as the
Service Consumer.
Web Services :-
Grid Computing :-
1. Grid Computing is a collection of computer resources from
multiple locations to reach a common goal.
2. The grid can be thought of as a distributed system with non-
interactive workloads that involve a large number of files.
3. A computational grid is a software and hardware infrastructure that
provides dependable, consistent and inexpensive access to high end
computational capabilities.
4. It links together computing resources (PCs, workstations,
servers, storage elements) and provides a mechanism to access
them.
How Grid Computing works :-
1. In general, a grid computing system requires at least one
computer, usually a server, which handles all the
administrative duties of the system.
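A minimal sketch of that mechanism — one admin server queues the non-interactive tasks, and autonomous nodes pull work whenever they are free. All names are hypothetical:

```python
import queue

def grid_run(tasks, nodes):
    """Admin server: queues tasks; each free node pulls the next one."""
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    log = []
    while not work.empty():
        for node in nodes:          # each node asks the server for work
            if work.empty():
                break
            log.append((node, work.get()))
    return log

print(grid_run(["t1", "t2", "t3"], ["laptop", "campus-server"]))
# [('laptop', 't1'), ('campus-server', 't2'), ('laptop', 't3')]
```

Because each task is independent, the server never coordinates the nodes with one another — it only hands out the next piece of work, which is what lets grid nodes be heterogeneous and far apart.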
Utility Computing :-
1. Utility computing is a service provisioning model that offers
computing resources such as hardware, software, and network
bandwidth to clients as and when they require them, on an
on-demand basis. The service provider charges only as per the
consumption of the services, rather than a fixed charge or a flat
rate.
2. Utility computing is a subset of cloud computing.
3. Utility computing is a model where computing resources, such as
processing power, storage and applications are provided to users on
demand, much like a traditional utility such as electricity or water.
Users typically pay for these resources on a metered basis, meaning
they are charged for the actual amount of resources they consume,
rather than a flat fee.
4. This model offers several advantages, including scalability,
flexibility, and cost-efficiency. Users can easily scale their
computing resources up or down based on their current needs,
without having to invest in and manage their own infrastructure.
Additionally, users can access these resources from anywhere with
an internet connection, making it particularly attractive for
businesses with dynamic or unpredictable computing needs.
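The metered, pay-per-consumption billing described above can be sketched with a tiny function — the resource names and per-unit rates below are made-up examples, not real provider prices:

```python
def metered_bill(usage, rates):
    """Charge per unit actually consumed, like electricity or water."""
    return sum(usage[res] * rates[res] for res in usage)

# Hypothetical per-unit rates and one month's consumption.
rates = {"cpu_hours": 0.05, "gb_storage": 0.02, "gb_bandwidth": 0.01}
usage = {"cpu_hours": 100, "gb_storage": 50, "gb_bandwidth": 200}
print(metered_bill(usage, rates))  # 5.0 + 1.0 + 2.0 = 8.0
```

Contrast this with a flat fee: under metering, halving next month's `cpu_hours` halves that line item of the bill, which is exactly the scalability/cost-efficiency argument made above.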
Hardware Virtualization :-
It is the abstraction of computing resources from the software that uses
cloud resources. It involves embedding virtual machine software into the
server's hardware components. That software is called the hypervisor.
The hypervisor manages the shared physical hardware resources between
the guest OS & the host OS. The abstracted hardware is represented as
actual hardware. Virtualization means abstraction & hardware
virtualization is achieved by abstracting the physical hardware part using
Virtual Machine Monitor (VMM) or hypervisor. Hypervisors rely on
command set extensions in the processors to accelerate common
virtualization activities for boosting the performance. The term hardware
virtualization is used when VMM or virtual machine software or any
hypervisor gets directly installed on the hardware system. The primary
tasks of the hypervisor are process monitoring and memory & hardware
control. After hardware virtualization is done, different operating
systems can be installed, and various applications can run on it. Hardware
virtualization, when done for server platforms, is also called server
virtualization.
Types of Hardware Virtualization:-
Full Virtualization: Here the hardware architecture is
completely simulated. Guest software doesn't need any
modification to run applications.
Emulation Virtualization: Here the virtual machine
simulates the hardware & is independent of it. Furthermore,
the guest OS doesn't require any modification.
Para-Virtualization: Here, the hardware is not simulated;
instead, the guest software runs in its own isolated system,
and the guest OS typically requires modification to do so.