Being Well-Architected in the Cloud
Adrian Hornsby, Technical Evangelist @ AWS
Twitter: @adhorn
Email: adhorn@amazon.com
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
• Technical Evangelist, Developer Advocate, … Software Engineer
• Own bed in Finland
• Previously:
• Solutions Architect @AWS
• Lead Cloud Architect @Dreambroker
• Director of Engineering, Software Engineer, DevOps, Manager, ... @Hdm
• Researcher @Nokia Research Center
• and a bunch of other stuff.
• Climber, like Ginger shots.
What to expect from the session
1. What is the Well-Architected framework
2. Framework Overview
3. How to be Well-Architected
4. Conclusion
What is the Well-Architected Framework?
Customer Challenges
• Faster response to change in market
• Delivery time
• Change Management
• Reduce human errors
• Scaling to demand
• Faster recovery
• High availability
• Automation
Download the Whitepaper
AWS Well-Architected Framework
Set of questions you can use to evaluate how well an architecture is aligned to AWS best practices
• Security
• Reliability
• Performance efficiency
• Cost optimization
• Operational excellence
Couple of fundamentals
AWS Global Infrastructure
• 16 Regions
• 42 Availability Zones
Building Blocks
[Diagram: a Region containing Availability Zones A, B, and C. Building blocks shown: server (Amazon EC2 instance), Amazon S3, subnet, Amazon CloudWatch.]
Security pillar
Protect information, systems, and assets while delivering business value
through risk assessments and mitigation strategies
• Security at all layers
• Enable traceability
• Implement a principle of least privilege
• Focus on securing system
• Automate security best practices
Shared Responsibility
Customers
• Customer applications & content
• Platform, Applications, Identity & Access Management
• Operating System, Network, and Firewall Configuration
• Client-side Data Encryption, Server-side Data Encryption, Network Traffic Protection
AWS Foundation Services
• Compute, Storage, Database, Networking
AWS Global Infrastructure
• Availability Zones, Regions, Edge Locations
Credentials
• Enforce MFA for everyone from day 1.
• Use AWS IAM Users and Roles from day 1.
• Enforce strong passwords.
• Protect and rotate credentials.
• No access keys in code.
EC2 Role
1: Create EC2 role
Create role in IAM service with limited policy
2: Launch EC2 instance
Launch instance with role
3: App retrieves credentials
Using AWS SDK, application retrieves temporary credentials
4: App accesses AWS resource(s)
Using AWS SDK, application uses credentials to access resource(s)
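A minimal Python sketch of the four steps above. It assumes the boto3 SDK is installed on the instance; the helper names and the bucket passed in are hypothetical. The trust policy lets the EC2 service assume the role, and the SDK then finds the role's temporary credentials on its own, so no access key appears in the code.

```python
import json


def ec2_trust_policy():
    """Trust policy that lets EC2 instances assume the role (step 1)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def list_object_keys(bucket_name):
    """Steps 3 and 4: meant to run on an instance launched with the role."""
    import boto3  # assumption: boto3 is installed on the instance

    # No keys are passed in: the SDK fetches temporary credentials for
    # the instance role and refreshes them automatically.
    s3 = boto3.client("s3")
    response = s3.list_objects_v2(Bucket=bucket_name)
    return [item["Key"] for item in response.get("Contents", [])]


if __name__ == "__main__":
    print(json.dumps(ec2_trust_policy(), indent=2))
```

The permissions policy attached to the role (the "limited policy" of step 1) is separate from this trust policy and should list only the actions the application needs.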
IAM Policies
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AddPerm",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
    }
  ]
}
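The policy above grants read access to every object in the bucket to anyone, because "Principal": "*" matches any caller. A small Python lint can flag such statements before a policy is applied. This is only a sketch: the function name is hypothetical and it is not an AWS tool.

```python
def wildcard_allow_statements(policy):
    """Return the Sids of Allow statements that are open to any principal."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement object is also valid
        statements = [statements]
    flagged = []
    for statement in statements:
        principal = statement.get("Principal")
        is_open = principal == "*" or principal == {"AWS": "*"}
        if statement.get("Effect") == "Allow" and is_open:
            flagged.append(statement.get("Sid", "<no Sid>"))
    return flagged


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AddPerm",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*",
        }
    ],
}

print(wildcard_allow_statements(policy))
```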
Network and Boundary
• Security groups are built-in stateful firewalls
• Divide layers of the stack into subnets
• Use a bastion host for access
• Implement host-based controls

Layers with Security Groups
[Diagram: in Availability Zone A, the user reaches the web servers in Web Subnet A, which are protected by the Server Security Group. The RDS DB instance sits in DB Subnet A behind the DB Security Group.]
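One way to express this layering with the AWS SDK for Python is to let the DB security group accept traffic only from the server security group, rather than from an IP range. The sketch below assumes boto3, a MySQL port (3306), and placeholder group IDs; none of these come from the slides.

```python
def db_ingress_rule(db_group_id, web_group_id, port=3306):
    """Build arguments that allow only the web tier to reach the DB tier."""
    return {
        "GroupId": db_group_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # The source is another security group, not a CIDR block.
                "UserIdGroupPairs": [{"GroupId": web_group_id}],
            }
        ],
    }


def apply_rule(rule):
    """Send the rule to EC2; needs boto3 and valid credentials."""
    import boto3  # assumption: boto3 is installed

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(**rule)


rule = db_ingress_rule("sg-db-placeholder", "sg-web-placeholder")
print(rule["IpPermissions"][0]["UserIdGroupPairs"])
```

Because the rule names a group instead of addresses, web instances can be added or replaced without touching the database tier's firewall.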
Bastion Host & Security Groups
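[Diagram: in Availability Zone A, a developer connects on port 22, restricted by source IP, to a bastion host in Public Subnet A (Bastion Security Group), and from there reaches the web servers (Server Security Group). The RDS DB instance sits in Private Subnet A behind the DB Security Group.]
The bastion runs only while it is needed:
> start_bastion
> ssh -A
> stop_bastion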
Monitoring and Auditing
• Capture & audit AWS CloudTrail, Amazon VPC and Amazon CloudWatch logs.
• Collect all logs centrally.
• Set up alerts.
Services shown: Amazon Virtual Private Cloud, AWS Identity & Access Management, AWS Key Management Service, AWS CloudTrail, AWS Config
Audit logs for all operations:
• Store/Archive
• Troubleshoot
• Monitor & Alarm
Verify everything, always, with AWS Config
Reliability pillar
Ability of a system to recover from infrastructure or service disruptions,
dynamically acquire computing resources to meet demand, and mitigate
disruptions such as misconfigurations or transient network issues
• Test recovery procedures
• Automatically recover from failure
• Scale horizontally to increase availability
• Stop guessing capacity
High Availability
• No Single Point of Failure
• Multiple Availability Zones
• Load Balancing
• Auto Scaling and Healing

Multi-AZ Architecture
Available & redundant application
[Diagram: the user reaches the application through Amazon Route 53 and Amazon CloudFront, with Amazon S3 alongside. A load balancer spreads traffic over web instances in two Availability Zones. One zone holds the active RDS DB instance (Multi-AZ), the other the standby.]
Weekly traffic pattern
[Chart: weekly traffic pattern, Sunday through Saturday]
Auto Scaling
• Maintain your Amazon EC2 instance availability
• Automatically scale your EC2 fleet up and down
• Scale based on CPU, memory, or custom metrics
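The scaling decision can be pictured as a proportional calculation: size the fleet so that the average metric returns to a target value. The Python sketch below is a simplification for illustration, not the Auto Scaling service's actual algorithm; the target, minimum, and maximum are assumed values.

```python
import math


def desired_capacity(current, metric, target, minimum=2, maximum=10):
    """Size the fleet so the average metric moves back toward the target."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    wanted = math.ceil(current * metric / target)
    # Clamp to the group's bounds, as an Auto Scaling group would.
    return max(minimum, min(maximum, wanted))


# Four instances at 90% average CPU with a 60% target: scale out.
print(desired_capacity(current=4, metric=90.0, target=60.0))
# Four instances at 15% average CPU: scale in, but not below the minimum.
print(desired_capacity(current=4, metric=15.0, target=60.0))
```

Keeping a minimum above one instance, spread over several Availability Zones, is what lets the same group also cover the availability bullet.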
Auto scaling groups
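[Diagram: the user reaches the application through Amazon Route 53 and Amazon CloudFront, with Amazon S3 alongside. An ELB spreads traffic over web instances in an Auto Scaling group that spans two Availability Zones, backed by ElastiCache instances. One zone holds the active RDS DB instance (Multi-AZ), the other the standby.]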
Backup and DR
• Define Objectives
• Backup Strategy
• Periodic Recovery Testing
• Automated Recovery
• Periodic Reviews
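Defining objectives usually means fixing a recovery point objective (RPO, how much data loss is tolerable) and a recovery time objective (RTO, how long recovery may take). Part of the periodic review can then be automated. The Python sketch below checks the age of the latest backup against an RPO; the four-hour RPO and the timestamps are made-up values.

```python
from datetime import datetime, timedelta, timezone


def meets_rpo(last_backup, now, rpo):
    """True if the newest backup is recent enough to satisfy the RPO."""
    return now - last_backup <= rpo


now = datetime(2017, 1, 1, 12, 0, tzinfo=timezone.utc)
rpo = timedelta(hours=4)

print(meets_rpo(now - timedelta(hours=3), now, rpo))  # backup is 3 hours old
print(meets_rpo(now - timedelta(hours=6), now, rpo))  # backup is 6 hours old
```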
Performance efficiency pillar
Efficient use of computing resources to meet requirements, and maintaining that efficiency as demand changes and technologies evolve
• Democratize advanced technologies
• Go global in minutes
• Use the right architectures for your backend and databases
• Experiment more often
Right Sizing
• Reference Architecture
• Quick Start Reference Deployments
• Benchmarking
• Load Testing
• Cost / Budget
• Monitoring and Notification
Utilization vs Provisioned capacity
[Chart: utilization vs. provisioned capacity for November, with segments labelled 76% and 24%]
Proximity and Caching
• Content Delivery Network (CDN)
• Database Caching
• Reduce Latency
• Proactive Monitoring and Notification
Services shown: Amazon CloudFront, Amazon ElastiCache, RDS DB instance read replica
Amazon CloudFront (CDN)
• Cache content at the edge for faster delivery
• Lower load on origin
• Dynamic and static content
• Streaming video
• Custom SSL certificates
• Low TTLs
Asynchronous patterns
[Diagram: two patterns. Message passing: A → Queue → B, where B is the listener. Pub-Sub: A → Queue → B.]
Examples: SNS, SQS, Redis, RabbitMQ
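The two patterns differ in who receives a message: a queue hands each message to one consumer, while pub-sub copies it to every subscriber. A minimal in-process Python sketch follows; the Topic class is hypothetical and merely stands in for a service such as SNS.

```python
import queue


class Topic:
    """Pub-sub: every subscriber queue gets its own copy of a message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self):
        inbox = queue.Queue()
        self.subscribers.append(inbox)
        return inbox

    def publish(self, message):
        for inbox in self.subscribers:
            inbox.put(message)


# Message passing: one queue, each message is consumed exactly once.
jobs = queue.Queue()
jobs.put("resize photo 1")
first = jobs.get()

# Pub-sub: both subscribers see the same event.
topic = Topic()
inbox_a = topic.subscribe()
inbox_b = topic.subscribe()
topic.publish("photo uploaded")

print(first, jobs.empty())
print(inbox_a.get(), inbox_b.get())
```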
Async. Architecture (part 1)
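[Diagram: web/API instances, a queue, worker instances, and ElastiCache]
1. The client calls the API: {DO foo}. The API answers at once: {JobID: 0001}
2. The API instance puts the job on the queue. PUT JOB: {JobID: 0001, Task: DO foo}
3. A worker instance takes the job from the queue. GET JOB: {JobID: 0001, Task: DO foo}
4. The worker stores the result in ElastiCache. Result: {JobID: 0001, Result: bar}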
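The flow above can be sketched in Python with a thread standing in for a worker instance and a dictionary standing in for ElastiCache. All names are illustrative; a real deployment would use a queue service and a shared cache.

```python
import itertools
import queue
import threading

job_queue = queue.Queue()
results = {}  # stand-in for ElastiCache
_ids = itertools.count(1)


def submit(task):
    """API side: enqueue the job and return its ID without waiting."""
    job_id = "%04d" % next(_ids)
    job_queue.put({"JobID": job_id, "Task": task})
    return {"JobID": job_id}


def worker():
    """Worker side: take jobs off the queue and store the results."""
    while True:
        job = job_queue.get()
        if job is None:  # sentinel: stop the worker
            job_queue.task_done()
            break
        # A real worker would do the task here; "bar" mirrors the slide.
        results[job["JobID"]] = {"JobID": job["JobID"], "Result": "bar"}
        job_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

ticket = submit("DO foo")
job_queue.join()  # only for the demo; a client would poll for the result
print(ticket, results[ticket["JobID"]])

job_queue.put(None)
thread.join()
```

The API call returns as soon as the job is queued, so slow work never holds a web request open, and the number of workers can be scaled independently of the API tier.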
Async. Architecture (part 2)
[Diagram: the same API instances, queue, worker instances, and ElastiCache as in part 1. When a job is done, a push notification is sent to the user through Amazon SNS.]
Full Decoupling
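[Diagram: the user reaches the application through Amazon Route 53 and Amazon CloudFront. An Elastic Load Balancer spreads traffic over web instances, which hand work to worker instances through a queue. Amazon SNS, Amazon S3, ElastiCache, and an active RDS DB instance (Multi-AZ) complete the picture, all within one Availability Zone.]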
Event-driven patterns
Event driven: an event on B by A triggers C (A → B → C)
How Lambda works
Invoked in response to events
- Changes in data
- Changes in state
Event sources: S3 event notifications, DynamoDB Streams, Kinesis events, SNS events, CloudTrail events, Cognito events, custom events, CloudWatch events
Lambda functions access any service, including your own
Such as… SNS, DynamoDB, Lambda, Redshift, Kinesis, S3, any custom
Event-driven using Lambda
1. Users upload photos to S3: Source Bucket
2. AWS Lambda: Resize Images (triggered on PUTs)
3. Resized images are written to S3: Destination Bucket
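A sketch of what the resize function's handler might look like in Python. The event layout follows the S3 event notification format; the destination bucket name and the resize step are placeholders, and boto3 is assumed to be available in the Lambda runtime.

```python
from urllib.parse import unquote_plus

DESTINATION_BUCKET = "destination-bucket-placeholder"  # hypothetical name


def uploaded_objects(event):
    """Extract (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the event.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def resize(image_bytes):
    """Placeholder: a real function would call an imaging library here."""
    return image_bytes


def lambda_handler(event, context):
    import boto3  # assumption: provided by the Lambda runtime

    s3 = boto3.client("s3")
    for bucket, key in uploaded_objects(event):
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        s3.put_object(Bucket=DESTINATION_BUCKET, Key=key, Body=resize(body))


sample = {
    "Records": [
        {"s3": {"bucket": {"name": "source-bucket"},
                "object": {"key": "photos/my+cat.jpg"}}}
    ]
}
print(uploaded_objects(sample))
```

Writing to a separate destination bucket matters: if the function wrote back to the source bucket, its own PUT would trigger it again.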
Databases
Read / Write Sharding
[Diagram: three app instances in front of four RDS DB instances: one master (Multi-AZ) and three read replicas]
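In application code, this layout usually comes down to a routing decision: writes go to the master, reads are spread over the replicas. A minimal Python sketch, with hypothetical endpoint names and a deliberately naive test for what counts as a read:

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads over the replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def endpoint_for(self, sql):
        # Naive check, for illustration only: treat SELECT as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.master


router = ReadWriteRouter("master", ["replica-1", "replica-2", "replica-3"])
print(router.endpoint_for("SELECT * FROM users"))
print(router.endpoint_for("UPDATE users SET name = 'x'"))
print(router.endpoint_for("SELECT 1"))
```

Read replicas are updated asynchronously, so a read that must see the latest write should still go to the master.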
Database Federation
[Diagram: three app instances in front of two separate databases, split by function: Products DB and Users DB]
Database Sharding
User     ShardID
002345   A
002346   B
002347   C
002348   B
002349   A
[Diagram: three app instances use the lookup table to route each user to shard A, B, or C]
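The lookup table above translates directly into a directory-based router. In this Python sketch the user-to-shard mapping is the one from the slide, while the endpoint host names are hypothetical.

```python
SHARD_DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}

SHARD_ENDPOINTS = {  # hypothetical host names
    "A": "shard-a.example.internal",
    "B": "shard-b.example.internal",
    "C": "shard-c.example.internal",
}


def endpoint_for_user(user_id):
    """Look up the user's shard, then the database that holds it."""
    shard_id = SHARD_DIRECTORY[user_id]
    return SHARD_ENDPOINTS[shard_id]


print(endpoint_for_user("002346"))
```

A directory like this makes it possible to move a single user to another shard by changing one row, at the cost of an extra lookup on every request.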
Specialized Database
NoSQL, Graph DB
Database specialization example: Redis
In-memory data structure store, used as a database, cache and message broker.
Specialized in data structures such as
• strings
• hashes
• lists
• sets
• sorted sets with range queries
• bitmaps
• hyperloglogs
• geospatial indexes with radius queries
Cost optimization pillar
Assess your ability to avoid or eliminate unneeded costs or suboptimal
resources, and use those savings on differentiated benefits for your business
• Analyze and attribute expenditure
• Use managed services to reduce TCO
• Adopt a consumption model
• Benefit from economies of scale
• Stop spending money on data center operations
Pricing Model
• On Demand
• Reserved
• Spot
• Dedicated
Auto Start/Shutdown of Instances
Amazon CloudWatch rules trigger AWS Lambda, which stops and starts AWS resources (EC2 instances):
• Sleep trigger. Rule: every day at 21h30
• Wakeup trigger. Rule: every day at 6h15
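The two rules can be written as CloudWatch Events schedule expressions, which use six cron fields and are evaluated in UTC. The Lambda sketch below is one possible shape, assuming boto3; the rule names, the way the handler works out which rule fired, and the instance ID are hypothetical.

```python
SCHEDULES = {
    "sleep-trigger": "cron(30 21 * * ? *)",   # every day at 21:30 UTC
    "wakeup-trigger": "cron(15 6 * * ? *)",   # every day at 06:15 UTC
}

ACTIONS = {"sleep-trigger": "stop", "wakeup-trigger": "start"}

INSTANCE_IDS = ["i-0123456789abcdef0"]  # placeholder


def action_for(event):
    """Map the rule that fired to 'stop' or 'start'."""
    rule_arn = event["resources"][0]  # ARN of the rule behind the event
    rule_name = rule_arn.rsplit("/", 1)[-1]
    return ACTIONS[rule_name]


def lambda_handler(event, context):
    import boto3  # assumption: provided by the Lambda runtime

    ec2 = boto3.client("ec2")
    if action_for(event) == "stop":
        ec2.stop_instances(InstanceIds=INSTANCE_IDS)
    else:
        ec2.start_instances(InstanceIds=INSTANCE_IDS)


event = {"resources": ["arn:aws:events:eu-west-1:123456789012:rule/sleep-trigger"]}
print(action_for(event))
```

With these times an instance runs a little over 15 hours a day instead of 24, which is where the saving for development and test environments comes from.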
Managed Services
• Let AWS do the heavy lifting.
• Databases, caches and big data solutions.
• Application Level Services.
Services shown: Amazon RDS, Amazon DynamoDB, Amazon Redshift, Amazon ElastiCache, AWS Elastic Beanstalk, Amazon Elasticsearch Service
Manage Expenditure
• Tag Resources
• Track Project Lifecycle
• Profile Applications vs Cost
• Monitor Usage & Spend

Auto Tagging resources as they start
Events: RunInstances → Amazon CloudWatch → AWS Lambda → EC2 Instances
Tag:
Owner = userName
PrincipalId = aws:userid
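A Python sketch of the tagging function. It assumes the rule delivers the CloudTrail record for RunInstances under the event's "detail" key and that boto3 is available in the Lambda runtime; the sample values are invented.

```python
def tags_from_event(event):
    """Build (instance_ids, tags) from a RunInstances CloudTrail event."""
    detail = event["detail"]
    identity = detail["userIdentity"]
    items = detail["responseElements"]["instancesSet"]["items"]
    instance_ids = [item["instanceId"] for item in items]
    # userName is present for IAM users; fall back to the principal ID.
    owner = identity.get("userName", identity["principalId"])
    tags = [
        {"Key": "Owner", "Value": owner},
        {"Key": "PrincipalId", "Value": identity["principalId"]},
    ]
    return instance_ids, tags


def lambda_handler(event, context):
    import boto3  # assumption: provided by the Lambda runtime

    instance_ids, tags = tags_from_event(event)
    if instance_ids:
        boto3.client("ec2").create_tags(Resources=instance_ids, Tags=tags)


sample = {
    "detail": {
        "eventName": "RunInstances",
        "userIdentity": {"principalId": "AIDAEXAMPLE", "userName": "alice"},
        "responseElements": {
            "instancesSet": {"items": [{"instanceId": "i-0123456789abcdef0"}]}
        },
    }
}
print(tags_from_event(sample))
```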
Operational excellence pillar
Operational practices and procedures used to manage production workloads
• Perform operations with code
• Align operations processes to business objectives
• Make regular, small, incremental changes
• Test for responses to unexpected events
• Learn from operational events and failures
• Keep operations procedures current
Infrastructure-as-code workflow
code → version control → code review → integrate
“It’s all software”
AWS CloudFormation
• Create templates of your infrastructure.
• Version control/replicate/update templates like code.
• Integrates with development, CI/CD, management tools
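As a small illustration of treating a template like code, the Python sketch below generates a minimal CloudFormation template that declares a single S3 bucket. The logical ID and description are arbitrary choices; the resulting JSON file is what would be committed, reviewed, and deployed.

```python
import json


def bucket_template(logical_id="ExampleBucket"):
    """Return a minimal CloudFormation template with one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Minimal example: one S3 bucket",
        "Resources": {
            logical_id: {"Type": "AWS::S3::Bucket"},
        },
    }


if __name__ == "__main__":
    # Sorted keys give stable output, which keeps version-control diffs small.
    print(json.dumps(bucket_template(), indent=2, sort_keys=True))
```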
Some tips … from my own experience
• Architecture as code – code everything.

• Automate everything: “Invest time to save time”
• Don’t reinvent the wheel; managed services are your best friends.
• Embrace security early on.
• Test your DR strategy regularly.
• Serverless architectures free you from managing infrastructure.
• Did I mention automation?
The “Must” from Day 1
Operational Excellence
• High quality code

• Version controlled
• CI/CD pipeline
• Infrastructure as code
• Security at every layer
• Cost conscious
• Test & Monitor everything
• DR procedure
And don’t forget …
Trusted Advisor
Resources
https://aws.amazon.com/well-architected/
Questions?
Twitter: @adhorn
Email: adhorn@amazon.com
