Module 10 - Autoscaling and Monitoring
© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Topics
• Elastic Load Balancing
• Amazon CloudWatch
• Amazon EC2 Auto Scaling
Activities
• Elastic Load Balancing activity
• Amazon CloudWatch activity
Lab
• Scale and Load Balance Your Architecture
Knowledge check
The module also includes two activities. One activity will challenge you to indicate Elastic
Load Balancing use cases. The other activity will challenge you to identify Amazon
CloudWatch examples.
The module also includes a hands-on lab where you will use Amazon EC2 Auto Scaling, Elastic
Load Balancing, and Amazon CloudWatch together to create a dynamically scalable
architecture.
Finally, you will be asked to complete a knowledge check that will test your understanding of
key concepts that are covered in this module.
Module objectives
Elastic Load Balancing is an AWS service that distributes incoming application or network
traffic across multiple targets—such as Amazon Elastic Compute Cloud (Amazon EC2)
instances, containers, internet protocol (IP) addresses, and Lambda functions—in a single
Availability Zone or across multiple Availability Zones. Elastic Load Balancing scales your load
balancer as traffic to your application changes over time. It can automatically scale to most
workloads.
Types of load balancers
• An Application Load Balancer operates at the application level (OSI model layer 7) and
routes HTTP and HTTPS traffic to targets, such as EC2 instances, containers, IP addresses,
and Lambda functions, based on the content of the request.
• A Network Load Balancer operates at the network transport level (OSI model layer 4),
routing connections to targets—EC2 instances, microservices, and containers—based on
IP protocol data. It works well for load balancing both Transmission Control Protocol (TCP)
and User Datagram Protocol (UDP) traffic. A Network Load Balancer is capable of handling
millions of requests per second while maintaining ultra-low latencies. A Network Load
Balancer is optimized to handle sudden and volatile network traffic patterns.
• A Classic Load Balancer provides basic load balancing across multiple EC2 instances, and it
operates at both the application level and network transport level. A Classic Load Balancer
supports the load balancing of applications that use HTTP, HTTPS, TCP, and SSL. The Classic
Load Balancer is an older implementation. When possible, AWS recommends that you use
a dedicated Application Load Balancer or Network Load Balancer.
To learn more about the differences between the three types of load balancers, see Product
comparisons on the Elastic Load Balancing Features page.
How Elastic Load Balancing works
• With Application Load Balancers and Network Load Balancers, you register targets in
target groups, and route traffic to the target groups.
[Diagram: in the AWS Cloud, a load balancer accepts incoming traffic from clients, with
targets in Availability Zone A and Availability Zone B.]
A load balancer accepts incoming traffic from clients and routes requests to its registered
targets (such as EC2 instances) in one or more Availability Zones.
You configure your load balancer to accept incoming traffic by specifying one or more
listeners. A listener is a process that checks for connection requests. It is configured with a
protocol and port number for connections from clients to the load balancer. Similarly, it is
configured with a protocol and port number for connections from the load balancer to the
targets.
You can also configure your load balancer to perform health checks, which are used to
monitor the health of the registered targets so that the load balancer only sends requests to
the healthy instances. When the load balancer detects an unhealthy target, it stops routing
traffic to that target. It then resumes routing traffic to that target when it detects that the
target is healthy again.
There is a key difference in how the load balancer types are configured. With Application
Load Balancers and Network Load Balancers, you register targets in target groups, and route
traffic to the target groups. With Classic Load Balancers, you register instances with the load
balancer.
Elastic Load Balancing use cases
• Automatically scale your applications – Elastic Load Balancing works with Amazon
CloudWatch and Amazon EC2 Auto Scaling to help you scale your applications to the
demands of your customers. Amazon CloudWatch alarms can trigger auto scaling for your
EC2 instance fleet when the latency of any one of your EC2 instances exceeds a
preconfigured threshold. Amazon EC2 Auto Scaling then provisions new instances and
your applications will be ready to serve the next customer request. The load balancer will
register the EC2 instance and direct traffic to it as needed.
• Use Elastic Load Balancing in your virtual private cloud (VPC) – You can use Elastic Load
Balancing to create a public entry point into your VPC, or to route request traffic between
tiers of your application within your VPC. You can assign security groups to your load
balancer to control which ports are open to a list of allowed sources. Because Elastic Load
Balancing works with your VPC, all your existing network access control lists (network
ACLs) and routing tables continue to provide additional network controls. When you
create a load balancer in your VPC, you can specify whether the load balancer is public
(default) or internal. If you select internal, you do not need to have an internet gateway to
reach the load balancer, and the private IP addresses of the load balancer will be used in
the load balancer’s Domain Name System (DNS) record.
• Enable hybrid load balancing – Elastic Load Balancing enables you to load balance across
AWS and on-premises resources by using the same load balancer. For example, if you
must distribute application traffic across both AWS and on-premises resources, you can
register all the resources to the same target group and associate the target group with a
load balancer. Alternatively, you can use DNS-based weighted load balancing across AWS
and on-premises resources by using two load balancers, with one load balancer for AWS
and the other load balancer for on-premises resources. You can also use hybrid load balancing
to benefit separate applications where one application is in a VPC and the other
application is in an on-premises location. Put the VPC targets in one target group and the
on-premises targets in another target group, and then use content-based routing to route
traffic to each target group.
• Invoke Lambda functions over HTTP(S) – Elastic Load Balancing supports invoking
Lambda functions to serve HTTP(S) requests. This enables users to access serverless
applications from any HTTP client, including web browsers. You can register Lambda
functions as targets and use the support for content-based routing rules in Application
Load Balancers to route requests to different Lambda functions. You can use an
Application Load Balancer as a common HTTP endpoint for applications that use servers
and serverless computing. You can build an entire website by using Lambda functions, or
combine EC2 instances, containers, on-premises servers, and Lambda functions to build
applications.
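The content-based routing mentioned in the hybrid and Lambda use cases can be illustrated with a small sketch. It assumes path-prefix rules only, and the rule list, paths, and target group names are hypothetical; it is not the Application Load Balancer rule syntax.

```python
def choose_target_group(path, rules, default):
    """Return the target group of the first rule whose prefix matches path."""
    for prefix, target_group in rules:
        if path.startswith(prefix):
            return target_group
    return default


# Hypothetical rules: each path prefix maps to one target group.
RULES = [
    ("/api/", "lambda-functions"),
    ("/legacy/", "on-premises-servers"),
]

for request_path in ("/api/orders", "/legacy/report", "/index.html"):
    print(request_path, "->", choose_target_group(request_path, RULES, "vpc-instances"))
```

Rules are checked in order and the first match wins, so requests that match no rule fall through to the default target group.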
Activity: Elastic Load Balancing
• You have extremely spiky and unpredictable TCP traffic. (Answer: Network Load Balancer)
• You need simple load balancing with multiple protocols. (Answer: Classic Load Balancer)
• You need a load balancer that can handle millions of requests per second while
maintaining low latencies. (Answer: Network Load Balancer)
For this activity, name the load balancer you would use for the given scenario.
Load balancer monitoring
You can use the following features to monitor your load balancers, analyze traffic patterns,
and troubleshoot issues with your load balancers and targets:
• Amazon CloudWatch metrics – Elastic Load Balancing publishes data points to Amazon
CloudWatch for your load balancers and your targets. CloudWatch enables you to retrieve
statistics about those data points as an ordered set of time series data, known as metrics.
You can use metrics to verify that your system is performing as expected. For example,
you can create a CloudWatch alarm to monitor a specified metric and initiate an action
(such as sending a notification to an email address) if the metric goes outside what you
consider an acceptable range.
• Access logs – You can use access logs to capture detailed information about the requests
that were made to your load balancer and store them as log files in Amazon Simple
Storage Service (Amazon S3). You can use these access logs to analyze traffic patterns and
to troubleshoot issues with your targets or backend applications.
• AWS CloudTrail logs – You can use AWS CloudTrail to capture detailed information about
the calls that were made to the Elastic Load Balancing application programming interface
(API) and store them as log files in Amazon S3. You can use these CloudTrail logs to
determine who made the call, what calls were made, when the call was made, the source
IP address of where the call came from, and so on.
Section 1 key takeaways
• Elastic Load Balancing distributes incoming application or network traffic across
multiple targets in one or more Availability Zones.
• Elastic Load Balancing supports three types of load balancers:
  • Application Load Balancer
  • Network Load Balancer
  • Classic Load Balancer
• ELB offers instance health checks, security, and monitoring.
To use AWS efficiently, you need insight into your AWS resources:
• How do you know when you should launch more Amazon EC2 instances?
Amazon CloudWatch
• Monitors –
  • AWS resources
  • Applications that run on AWS
• Collects and tracks –
  • Standard metrics
  • Custom metrics
• Alarms –
  • Send notifications to an Amazon SNS topic
  • Perform Amazon EC2 Auto Scaling or Amazon EC2 actions
• Events –
  • Define rules to match changes in AWS environment and route these events to one or
    more target functions or streams for processing
Amazon CloudWatch is a monitoring and observability service that is built for DevOps
engineers, developers, site reliability engineers (SRE), and IT managers. CloudWatch monitors
your AWS resources (and the applications that you run on AWS) in real time. You can use
CloudWatch to collect and track metrics, which are variables that you can measure for your
resources and applications.
You can create an alarm to monitor any Amazon CloudWatch metric in your account and use
the alarm to automatically send a notification to an Amazon Simple Notification Service
(Amazon SNS) topic or perform an Amazon EC2 Auto Scaling or Amazon EC2 action. For
example, you can create alarms on the CPU utilization of an EC2 instance, Elastic Load
Balancing request latency, Amazon DynamoDB table throughput, Amazon Simple Queue
Service (Amazon SQS) queue length, or even the charges on your AWS bill. You can also
create an alarm on custom metrics that are specific to your custom applications or
infrastructure.
You can also use Amazon CloudWatch Events to define rules that match incoming events (or
changes in your AWS environment) and route them to targets for processing. Targets can
include Amazon EC2 instances, AWS Lambda functions, Kinesis streams, Amazon ECS tasks,
Step Functions state machines, Amazon SNS topics, Amazon SQS queues, and built-in targets.
CloudWatch Events becomes aware of operational changes as they occur. CloudWatch Events
responds to these operational changes and takes corrective action as necessary, by sending
messages to respond to the environment, activating functions, making changes, and
capturing state information.
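To make the idea of rules that match events and route them to targets more concrete, here is a minimal sketch. It assumes a simplified matcher that supports only exact-value matching, and the target names are hypothetical; the real service supports richer patterns.

```python
def matches(pattern, event):
    """Return True if every field in pattern is matched by event.

    A pattern maps each field to a list of accepted values, or to a
    nested pattern (dict) for nested fields such as "detail".
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


def route(event, rules):
    """Return the targets of every rule whose pattern matches the event."""
    targets = []
    for pattern, rule_targets in rules:
        if matches(pattern, event):
            targets.extend(rule_targets)
    return targets


# Hypothetical rules and targets, for illustration only.
RULES = [
    ({"source": ["aws.ec2"], "detail": {"state": ["terminated"]}},
     ["cleanup-function", "ops-topic"]),
    ({"source": ["aws.autoscaling"]}, ["audit-stream"]),
]

event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "terminated"},
}
print(route(event, RULES))
```

One event can match several rules, and one rule can fan out to several targets, which is why `route` collects a list rather than returning the first match.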
With CloudWatch, you gain system-wide visibility into resource utilization, application
performance, and operational health. There is no upfront commitment or minimum fee; you
pay only for what you use, and you are charged at the end of the month.
CloudWatch alarms
You can create a CloudWatch alarm that watches a single CloudWatch metric or the result of
a math expression based on CloudWatch metrics. You can create a CloudWatch alarm based
on a static threshold, anomaly detection, or a metric math expression.
When you create an alarm based on a static threshold, you choose a CloudWatch metric for
the alarm to watch and the threshold for that metric. The alarm goes to ALARM state when
the metric breaches the threshold for a specified number of evaluation periods.
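The static-threshold rule can be sketched as a small function. This is a simplification under stated assumptions: a breach means the value is strictly greater than the threshold, and every one of the most recent evaluation periods must breach. The function name and the sample values are illustrative.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Evaluate a static-threshold alarm over the most recent datapoints.

    datapoints: one metric value per period, oldest first.
    Returns "ALARM", "OK", or "INSUFFICIENT_DATA".
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# CPU utilization per period, with a threshold of 60 percent for 3 periods.
print(alarm_state([40, 65, 70, 72], threshold=60, evaluation_periods=3))
print(alarm_state([65, 70, 55], threshold=60, evaluation_periods=3))
print(alarm_state([70], threshold=60, evaluation_periods=3))
```

Requiring several consecutive breaching periods keeps a single short spike from changing the alarm state.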
For this activity, see if you can identify which are correct CloudWatch alarms. For the ones
that are incorrect, see if you can identify the error.
Section 2 key takeaways
• Amazon CloudWatch helps you monitor your AWS resources, and the applications that
you run on AWS, in real time.
• CloudWatch enables you to –
  • Collect and track standard and custom metrics.
  • Set alarms to automatically send notifications to SNS topics, or perform Amazon EC2
    Auto Scaling or Amazon EC2 actions.
  • Define rules that match changes in your AWS environment and route these events to
    targets for processing.
When you run your applications on AWS, you want to ensure that your architecture can scale
to handle changes in demand. In this section, you will learn how to automatically scale your
EC2 instances with Amazon EC2 Auto Scaling.
Why is scaling important?
[Figure: two charts of capacity by day of the week (Su through Sa), one labeled "Unused
capacity" and one labeled "Over capacity".]
Scaling is the ability to increase or decrease the compute capacity of your application. To
understand why scaling is important, consider this example of a workload that has varying
resource requirements. In this example, the most resource capacity is required on
Wednesday, and the least resource capacity is required on Sunday.
One option is to allocate more than enough capacity so you can always meet your highest
demand—in this case, Wednesday. However, this situation means that you are running
resources that will be underutilized most days of the week. With this option, your costs are
not optimized.
Another option is to allocate less capacity to reduce costs. This situation means that you are
under capacity on certain days. If you don't solve your capacity problem, your application
could underperform or potentially even become unavailable for users.
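The trade-off between the two fixed-capacity options can be worked through with a little arithmetic. The daily demand figures below are invented for illustration (peak on Wednesday, lowest on Sunday) and do not come from the slide.

```python
# Hypothetical demand, in instances needed per day.
DEMAND = {"Su": 2, "M": 4, "T": 6, "W": 10, "Th": 7, "F": 5, "Sa": 3}


def unused_fraction(demand, provisioned):
    """Fraction of provisioned instance-days that go unused over the week."""
    total = provisioned * len(demand)
    used = sum(min(needed, provisioned) for needed in demand.values())
    return (total - used) / total


def days_under_capacity(demand, provisioned):
    """Days on which demand exceeds the fixed provisioned capacity."""
    return [day for day, needed in demand.items() if needed > provisioned]


peak = max(DEMAND.values())
print(f"Provision for peak ({peak}): {unused_fraction(DEMAND, peak):.0%} unused")
print(f"Provision 5 instances: under capacity on {days_under_capacity(DEMAND, 5)}")
```

With these sample numbers, provisioning for the Wednesday peak leaves 33 of 70 instance-days unused, while provisioning five instances leaves the application short on three days.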
Amazon EC2 Auto Scaling
In the cloud, because computing power is a programmatic resource, you can take a flexible
approach to scaling. Amazon EC2 Auto Scaling is an AWS service that helps you maintain
application availability and enables you to automatically add or remove EC2 instances
according to conditions you define. You can use the fleet management features of EC2 Auto
Scaling to maintain the health and availability of your fleet.
Amazon EC2 Auto Scaling provides several ways to adjust scaling to best meet the needs of
your applications. You can add or remove EC2 instances manually, on a schedule, in response
to changing demand, or in combination with AWS Auto Scaling for predictive scaling. Dynamic
scaling and predictive scaling can be used together to scale faster.
To learn more about Amazon EC2 Auto Scaling, see the Amazon EC2 Auto Scaling product
page.
Typical weekly traffic at Amazon.com
[Figure: weekly traffic compared with provisioned capacity.]
Automatic scaling is useful for predictable workloads—for example, the weekly traffic at the
retail company Amazon.com.
November traffic to Amazon.com
[Figure: traffic over the year with a peak in November; the chart is annotated "24 percent".]
Automatic scaling is also useful for dynamic on-demand scaling. Amazon.com experiences a
seasonal peak in traffic in November (on Black Friday and Cyber Monday, which are days at
the end of November when US retailers hold major sales). If Amazon provisions a fixed
capacity to accommodate the highest use, 76 percent of the resources are idle for most of
the year. Capacity scaling is necessary to support the fluctuating demands for service.
Without scaling, the servers could crash due to saturation, and the business would lose
customer confidence.
Auto Scaling groups
[Figure: an Auto Scaling group annotated with its desired capacity and maximum size.]
An Auto Scaling group is a collection of Amazon EC2 instances that are treated as a logical
grouping for the purposes of automatic scaling and management. The size of an Auto Scaling
group depends on the number of instances you set as the desired capacity. You can adjust its
size to meet demand, either manually or by using automatic scaling.
You can specify the minimum number of instances in each Auto Scaling group, and Amazon
EC2 Auto Scaling is designed to prevent your group from going below this size. You can
specify the maximum number of instances in each Auto Scaling group, and Amazon EC2 Auto
Scaling is designed to prevent your group from going above this size. If you specify the
desired capacity, either when you create the group or at any time afterwards, Amazon EC2
Auto Scaling is designed to adjust the size of your group so it has the specified number of
instances. If you specify scaling policies, then Amazon EC2 Auto Scaling can launch or
terminate instances as demand on your application increases or decreases.
For example, this Auto Scaling group has a minimum size of one instance, a desired capacity
of two instances, and a maximum size of four instances. The scaling policies that you define
adjust the number of instances within your minimum and maximum number of instances,
based on the criteria that you specify.
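The way minimum and maximum size bound the result of a scaling policy can be sketched as a clamp. This is a simplified model using the numbers from the example above; the function name is an assumption.

```python
def effective_capacity(requested, minimum, maximum):
    """Bound a requested instance count to the group's minimum and maximum size."""
    if minimum > maximum:
        raise ValueError("minimum size cannot exceed maximum size")
    return max(minimum, min(requested, maximum))


# Group from the example: minimum 1, desired 2, maximum 4.
print(effective_capacity(2, minimum=1, maximum=4))   # desired capacity is within bounds
print(effective_capacity(6, minimum=1, maximum=4))   # a scale-out request is capped at the maximum
print(effective_capacity(0, minimum=1, maximum=4))   # a scale-in request stops at the minimum
```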
Scaling out versus scaling in
With Amazon EC2 Auto Scaling, launching instances is referred to as scaling out, and
terminating instances is referred to as scaling in.
How Amazon EC2 Auto Scaling works
What: Launch configuration
• AMI
• Instance type
• IAM role
• Security groups
• EBS volumes
Where: Auto Scaling group
• VPC and subnets
• Load balancer
When:
• Maintain current number: health checks
• Scheduled scaling: scheduled actions
• Dynamic scaling: scaling policies
• Predictive scaling: AWS Auto Scaling
To launch EC2 instances, an Auto Scaling group uses a launch configuration, which is an
instance configuration template. You can think of a launch configuration as what you are
scaling. When you create a launch configuration, you specify information for the instances.
The information you specify includes the ID of the Amazon Machine Image (AMI), the
instance type, AWS Identity and Access Management (IAM) role, additional storage, one or
more security groups, and any Amazon Elastic Block Store (Amazon EBS) volumes.
You define the minimum and maximum number of instances and desired capacity of your
Auto Scaling group. Then, you launch it into a subnet within a VPC (you can think of this as
where you are scaling). Amazon EC2 Auto Scaling integrates with Elastic Load Balancing to
enable you to attach one or more load balancers to an existing Auto Scaling group. After you
attach the load balancer, it automatically registers the instances in the group and distributes
incoming traffic across the instances.
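The split between what you are scaling and where you are scaling can be modeled with two small data classes. This is a sketch only: the field names, the placeholder AMI ID, the role, security group, and subnet names are hypothetical, and spreading instances evenly across subnets is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LaunchConfiguration:
    """What to launch: a template for each instance (illustrative fields)."""
    ami_id: str
    instance_type: str
    iam_role: str
    security_groups: tuple
    ebs_volumes: tuple = ()


@dataclass
class AutoScalingGroup:
    """Where to launch, and how many instances to keep."""
    launch_configuration: LaunchConfiguration
    subnets: tuple
    min_size: int
    max_size: int
    desired_capacity: int
    instances: list = field(default_factory=list)

    def reconcile(self):
        """Add or drop placeholder instances until the group matches desired capacity."""
        target = max(self.min_size, min(self.desired_capacity, self.max_size))
        while len(self.instances) < target:
            subnet = self.subnets[len(self.instances) % len(self.subnets)]
            self.instances.append({
                "ami": self.launch_configuration.ami_id,
                "type": self.launch_configuration.instance_type,
                "subnet": subnet,
            })
        del self.instances[target:]
        return self.instances


config = LaunchConfiguration(
    ami_id="ami-EXAMPLE",                 # placeholder, not a real AMI ID
    instance_type="t2.micro",
    iam_role="web-server-role",           # hypothetical role name
    security_groups=("web-sg",),          # hypothetical security group
)
group = AutoScalingGroup(config, ("private-subnet-1", "private-subnet-2"),
                         min_size=1, max_size=4, desired_capacity=2)
for instance in group.reconcile():
    print(instance)
```

Because every instance is created from the same template, changing the desired capacity changes only the count, not the shape, of the fleet.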
Finally, you specify when you want the scaling event to occur. You have many scaling options:
• Maintain current instance levels at all times – You can configure your Auto Scaling group
to maintain a specified number of running instances at all times. To maintain the current
instance levels, Amazon EC2 Auto Scaling performs a periodic health check on running
instances in an Auto Scaling group. When Amazon EC2 Auto Scaling finds an unhealthy
instance, it terminates that instance and launches a new one.
• Manual scaling – With manual scaling, you specify only the change in the maximum,
minimum, or desired capacity of your Auto Scaling group.
• Scheduled scaling – With scheduled scaling, scaling actions are performed automatically as
a function of date and time. This is useful for predictable workloads when you know
exactly when to increase or decrease the number of instances in your group. For example,
say that every week, the traffic to your web application starts to increase on Wednesday,
remains high on Thursday, and starts to decrease on Friday. You can plan your scaling
actions based on the predictable traffic patterns of your web application. To implement
scheduled scaling, you create a scheduled action.
• Dynamic, on-demand scaling – A more advanced way to scale your resources enables you
to define parameters that control the scaling process. For example, you have a web
application that currently runs on two instances and you want the CPU utilization of the
Auto Scaling group to stay close to 50 percent when the load on the application changes.
This option is useful for scaling in response to changing conditions, when you don't know
when those conditions will change. Dynamic scaling gives you extra capacity to handle
traffic spikes without maintaining an excessive amount of idle resources. You can
configure your Auto Scaling group to scale automatically to meet this need. The scaling
policy type determines how the scaling action is performed. You can use Amazon EC2 Auto
Scaling with Amazon CloudWatch to trigger the scaling policy in response to an alarm.
• Predictive scaling – You can use Amazon EC2 Auto Scaling with AWS Auto Scaling to
implement predictive scaling, where your capacity scales based on predicted demand.
Predictive scaling uses data that is collected from your actual EC2 usage, and the data is
further informed by billions of data points that are drawn from AWS observations.
AWS then uses well-trained machine learning models to predict your expected traffic (and
EC2 usage), including daily and weekly patterns. The model needs at least 1 day of
historical data to start making predictions. It is re-evaluated every 24 hours to create a
forecast for the next 48 hours. The prediction process produces a scaling plan that can
drive one or more groups of automatically scaled EC2 instances.
To learn more about these options, see Scaling the Size of Your Auto Scaling Group in the
AWS Documentation.
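The dynamic scaling example, keeping average CPU utilization close to 50 percent, can be illustrated with a simple proportional rule. This is an assumed simplification for teaching purposes, not the algorithm that Amazon EC2 Auto Scaling uses; the function name and the sample numbers are hypothetical.

```python
import math


def capacity_for_target(current_instances, current_cpu, target_cpu,
                        min_size, max_size):
    """Estimate the instance count that brings average CPU back near the target.

    Assumes total load stays constant and spreads evenly across instances.
    """
    if current_instances < 1 or target_cpu <= 0:
        raise ValueError("need at least one instance and a positive target")
    needed = math.ceil(current_instances * current_cpu / target_cpu)
    return max(min_size, min(needed, max_size))


# Two instances, target of 50 percent average CPU, group bounds of 1 to 6.
print(capacity_for_target(2, current_cpu=80, target_cpu=50, min_size=1, max_size=6))
print(capacity_for_target(2, current_cpu=50, target_cpu=50, min_size=1, max_size=6))
print(capacity_for_target(2, current_cpu=20, target_cpu=50, min_size=1, max_size=6))
```

Under these assumptions, two instances at 80 percent CPU scale out to four, and two instances at 20 percent scale in to one, always within the group's minimum and maximum size.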
Implementing dynamic scaling
[Figure: CPU utilization. If average CPU utilization is > 60% for 5 minutes…]
Amazon CloudWatch, Amazon EC2 Auto Scaling, and Elastic Load Balancing work well
individually. Together, however, they become more powerful and increase the control and
flexibility over how your application handles customer demand.
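How the three services cooperate can be sketched end to end: a metric breach triggers a scale-out, and the new instance joins the set that the load balancer routes to. The sketch assumes one CPU reading per minute, interprets the slide's condition as the mean of the last five readings exceeding 60 percent, and uses made-up instance names.

```python
from statistics import mean


def scale_out_if_needed(cpu_samples, instances, max_size,
                        threshold=60.0, minutes=5):
    """Return the instance list after evaluating one scale-out rule.

    cpu_samples: average CPU utilization per minute, newest last.
    instances: names of instances currently registered with the load balancer.
    """
    recent = cpu_samples[-minutes:]
    if len(recent) < minutes or mean(recent) <= threshold:
        return instances                      # alarm condition not met
    if len(instances) >= max_size:
        return instances                      # already at maximum size
    new_instance = f"instance-{len(instances) + 1}"
    return instances + [new_instance]         # launch and register one instance


fleet = ["instance-1", "instance-2"]
fleet = scale_out_if_needed([50, 62, 65, 70, 68, 66], fleet, max_size=4)
print(fleet)
```

A matching scale-in rule would follow the same shape with a lower threshold and the group's minimum size as the floor.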
AWS Auto Scaling
So far, you learned about scaling EC2 instances with Amazon EC2 Auto Scaling. You also
learned that you can use Amazon EC2 Auto Scaling with AWS Auto Scaling to perform
predictive scaling.
AWS Auto Scaling is a separate service that monitors your applications. It automatically
adjusts capacity to maintain steady, predictable performance at the lowest possible cost. The
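service provides a simple, powerful user interface that enables you to build scaling plans for
resources, including Amazon EC2 instances and Spot Fleets, Amazon ECS tasks, Amazon
DynamoDB tables and indexes, and Amazon Aurora Replicas.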
If you are already using Amazon EC2 Auto Scaling to dynamically scale your EC2 instances,
you can now use it with AWS Auto Scaling to scale additional resources for other AWS
services.
To learn more about AWS Auto Scaling, see AWS Auto Scaling.
Section 3 key takeaways
• Scaling enables you to respond quickly to changes in resource needs.
• Amazon EC2 Auto Scaling maintains application availability by automatically adding or
removing EC2 instances.
• An Auto Scaling group is a collection of EC2 instances.
• A launch configuration is an instance configuration template.
• Dynamic scaling uses Amazon EC2 Auto Scaling, CloudWatch, and Elastic Load Balancing.
• AWS Auto Scaling is a separate service from Amazon EC2 Auto Scaling.
AWS Auto Scaling is a separate service that monitors your applications, and it automatically
adjusts capacity for the following resources:
• Amazon EC2 instances and Spot Fleets
• Amazon ECS tasks
• Amazon DynamoDB tables and indexes
• Amazon Aurora Replicas
Lab 6: Scale and Load Balance Your Architecture
You will now complete Lab 6: Scale and Load Balance Your Architecture.
Lab 6: Scenario
[Diagram labels: AWS Cloud; Region; Availability Zone A; Availability Zone B;
VPC (10.0.0.0/16); internet gateway; Public subnet 1 (10.0.0.0/24);
Public subnet 2 (10.0.2.0/24); security group; NAT gateway; Web Server 1.]
In this lab, you will use Elastic Load Balancing and Amazon EC2 Auto Scaling to load balance
and scale your infrastructure. You will start with the given infrastructure.
Lab 6: Tasks
[Diagram labels: NAT gateway; Application Load Balancer; Private subnet 1 (10.0.1.0/24);
Private subnet 2 (10.0.3.0/24).]
The diagram summarizes what you will have built after you complete the lab.
Begin lab 6
~ 30 minutes
It is now time to start the lab. It should take you approximately 30 minutes to complete the
lab.
Lab debrief: Key takeaways
Module wrap-up
It’s now time to review the module, and wrap up with a knowledge check and discussion of a
practice certification exam question.
Module summary
Which service would you use to send alerts based on Amazon CloudWatch alarms?
Look at the answer choices and rule them out based on the keywords that were previously
highlighted.
Thank you
© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved. This work may not be reproduced or redistributed, in whole or in part, without prior written permission from Amazon
Web Services, Inc. Commercial copying, lending, or selling is prohibited. For corrections or feedback on the course, please email us at: aws-course-feedback@amazon.com. For all other
questions, contact us at: https://aws.amazon.com/contact-us/aws-training/. All trademarks are the property of their owners.