Being Well-Architected in the Cloud


Adrian Hornsby, Technical Evangelist @ AWS

Twitter: @adhorn
Email: adhorn@amazon.com

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
• Technical Evangelist, Developer Advocate, … Software Engineer
• Own bed in Finland
• Previously:
• Solutions Architect @AWS
• Lead Cloud Architect @Dreambroker
• Director of Engineering, Software Engineer, DevOps, Manager, ... @Hdm
• Researcher @Nokia Research Center
• and a bunch of other stuff.
• Climber, like Ginger shots.
What to expect from the session

1. What is the Well-Architected Framework?
2. Framework Overview
3. How to be Well-Architected
4. Conclusion
What is the Well-Architected Framework?
Customer Challenges

• Faster response to change in market
• Delivery time
• Change Management
• Reduce human errors
• Scaling to demand
• Faster recovery
• High availability
• Automation

Download the Whitepaper
AWS Well-Architected Framework
Set of questions you can use to evaluate how well an architecture is
aligned to AWS best practices

• Security
• Reliability
• Performance efficiency
• Cost optimization
• Operational excellence
A couple of fundamentals
AWS Global Infrastructure

16 Regions, 42 Availability Zones
Building Blocks

[Diagram: a Region containing Availability Zones A, B and C. Building blocks shown: Server (Amazon EC2 instance), Amazon S3, Subnet, Amazon CloudWatch.]
Security pillar
Protect information, systems, and assets while delivering business value
through risk assessments and mitigation strategies

• Security at all layers
• Enable traceability
• Implement a principle of least privilege
• Focus on securing system
• Automate security best practices
Shared Responsibility

Customers
• Customer applications & content
• Platform, Applications, Identity & Access Management
• Operating System, Network, and Firewall Configuration
• Client-side Data Encryption, Server-side Data Encryption, Network Traffic Protection

AWS
• AWS Foundation Services: Compute, Storage, Database, Networking
• AWS Global Infrastructure: Availability Zones, Regions, Edge Locations
Credentials

• Enforce MFA for everyone from day 1.
• Use AWS IAM Users and Roles from day 1.
• Enforce strong passwords.
• Protect and rotate credentials.
• No access keys in code.
EC2 Role

1: Create EC2 role
Create role in IAM service with limited policy

2: Launch EC2 instance
Launch instance with role

3: App retrieves credentials
Using AWS SDK, application retrieves temporary credentials

4: App accesses AWS resource(s)
Using AWS SDK, application uses credentials to access resource(s)
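
Step 1 can be sketched as two JSON documents: a trust policy that lets the EC2 service assume the role, and a limited permissions policy attached to it. This is a minimal sketch built with plain dictionaries; the bucket name and the choice of a read-only S3 permission are assumptions made for illustration, not part of the slides.

```python
import json


def ec2_trust_policy():
    """Trust policy that lets the EC2 service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def read_only_bucket_policy(bucket_name):
    """Limited permissions policy: read objects from one bucket only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a hypothetical name used only for this demo.
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(read_only_bucket_policy("example-bucket"), indent=2))
```

With a role like this attached to the instance, the application never needs an access key in its code or configuration.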
IAM Policies

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AddPerm",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*"
    }
  ]
}
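
The policy above allows s3:GetObject for any principal ("Principal": "*"), which makes the bucket's objects publicly readable. A small review helper can flag statements like that before they are deployed. The function below is a simplified sketch: it only recognizes the two most common wildcard forms and is not a full policy evaluator.

```python
def public_statements(policy):
    """Return the Allow statements that apply to any principal."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    flagged = []
    for stmt in statements:
        principal = stmt.get("Principal")
        is_wildcard = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        if stmt.get("Effect") == "Allow" and is_wildcard:
            flagged.append(stmt)
    return flagged


# The policy from the slide, as a Python dictionary.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AddPerm",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::YOUR_BUCKET_NAME/*",
        }
    ],
}

if __name__ == "__main__":
    for stmt in public_statements(policy):
        print("public access:", stmt["Sid"], stmt["Action"])
```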
Network and Boundary

• Security groups are built-in stateful firewalls
• Divide layers of the stack into subnets
• Use a bastion host for access
• Implement host-based controls

Layers with Security Groups

[Diagram: within Availability Zone A, the user reaches the WEB Server in Web Subnet A, protected by the WEB Security Group; the RDS DB Instance sits in DB Subnet A, protected by the DB Security Group.]
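
The layering can be sketched as ingress rules in which the DB security group accepts traffic only from the web security group, not from any IP range. The dictionaries below follow the IpPermissions shape used by the EC2 AuthorizeSecurityGroupIngress call; the group ID and the ports (443 for web, 3306 for a MySQL-style database) are assumptions for illustration.

```python
def web_ingress_from_internet(port=443):
    """Web tier: accept traffic from anywhere on the web port."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        }
    ]


def db_ingress_from_web(web_group_id, db_port=3306):
    """DB tier: accept traffic only from members of the web security group."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": db_port,
            "ToPort": db_port,
            "UserIdGroupPairs": [{"GroupId": web_group_id}],
        }
    ]


if __name__ == "__main__":
    # "sg-web-example" is a hypothetical security group ID.
    print(web_ingress_from_internet())
    print(db_ingress_from_web("sg-web-example"))
```

Referencing the web group instead of an address range means new web instances gain database access as soon as they join the group.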
Bastion Host & Security Groups

[Diagram: within Availability Zone A, the developer connects to the Bastion Host in Public Subnet A; the Bastion Security Group allows port 22 with an IP restriction. The WEB Server (WEB Security Group) is also in Public Subnet A; the RDS DB Instance (DB Security Group) is in Private Subnet A.]

> start_bastion
> ssh -A
> stop_bastion
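
The "IP restriction" on the bastion security group amounts to a check of the source address against an allow list of CIDR ranges. Here is a minimal sketch of that check using only the standard library; the range shown is a documentation-only network standing in for an office address range.

```python
import ipaddress

# Hypothetical allow list; 203.0.113.0/24 is reserved for documentation.
ALLOWED_SSH_SOURCES = ["203.0.113.0/24"]


def ssh_allowed(source_ip, allowed_cidrs=ALLOWED_SSH_SOURCES):
    """Return True when source_ip falls inside one of the allowed ranges."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


if __name__ == "__main__":
    print(ssh_allowed("203.0.113.10"))
    print(ssh_allowed("198.51.100.7"))
```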
Monitoring and Auditing

• Capture & audit AWS CloudTrail, Amazon VPC and Amazon CloudWatch logs.
• Collect all logs centrally.
• Set up alerts.

Services: Amazon Virtual Private Cloud, AWS Identity & Access Management, AWS Key Management Service, AWS CloudTrail, AWS Config

Audit logs for all operations
• Store/Archive
• Troubleshoot
• Monitor & Alarm

Verify everything, always, with AWS Config
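
As one example of an alert built on audit logs, the sketch below scans CloudTrail-style records for activity by the account root user and for failed console sign-ins. The field names (userIdentity.type, eventName, responseElements.ConsoleLogin) follow the CloudTrail record format; the sample records are invented for the demo.

```python
def root_activity(records):
    """Return records whose caller is the account root user."""
    return [r for r in records if r.get("userIdentity", {}).get("type") == "Root"]


def failed_console_logins(records):
    """Return ConsoleLogin records that ended in failure."""
    return [
        r
        for r in records
        if r.get("eventName") == "ConsoleLogin"
        and (r.get("responseElements") or {}).get("ConsoleLogin") == "Failure"
    ]


# Invented sample records, trimmed to the fields the checks read.
SAMPLE = [
    {"eventName": "RunInstances", "userIdentity": {"type": "IAMUser"}},
    {"eventName": "DeleteBucket", "userIdentity": {"type": "Root"}},
    {
        "eventName": "ConsoleLogin",
        "userIdentity": {"type": "IAMUser"},
        "responseElements": {"ConsoleLogin": "Failure"},
    },
]

if __name__ == "__main__":
    print(len(root_activity(SAMPLE)), "root events")
    print(len(failed_console_logins(SAMPLE)), "failed console logins")
```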
Reliability pillar
Ability of a system to recover from infrastructure or service disruptions,
dynamically acquire computing resources to meet demand, and mitigate
disruptions such as misconfigurations or transient network issues

• Test recovery procedures
• Automatically recover from failure
• Scale horizontally to increase availability
• Stop guessing capacity
High Availability

• No Single Point of Failure
• Multiple Availability Zones
• Load Balancing
• Auto Scaling and Healing

Multi-AZ Architecture
Available & redundant application

[Diagram: the user reaches the application through Amazon Route 53 and Amazon CloudFront, with Amazon S3 alongside; a load balancer spreads traffic across web instances in two Availability Zones; the RDS DB instance is Active (Multi-AZ) in one zone with a Standby (Multi-AZ) in the other.]
Weekly traffic pattern

[Chart: traffic per day, Sunday through Saturday.]

Auto Scaling

• Maintain your Amazon EC2 instance availability
• Automatically Scale Up and Down your EC2 Fleet
• Scale based on CPU, Memory or Custom metrics
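
Scaling on a metric can be illustrated with a simple proportional rule: size the fleet so that the average metric returns to a target value, within fixed bounds. This is a teaching sketch, not the algorithm the Auto Scaling service uses; the function name and parameters are assumptions.

```python
import math


def desired_capacity(current, metric, target, minimum, maximum):
    """Instances needed to bring the average metric back to the target."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    wanted = math.ceil(current * metric / target)
    # Never go below the minimum or above the maximum fleet size.
    return max(minimum, min(maximum, wanted))


if __name__ == "__main__":
    # 4 instances at 90% average CPU with a 60% target: scale out.
    print(desired_capacity(current=4, metric=90, target=60, minimum=2, maximum=10))
    # 4 instances at 15% average CPU: scale in, but not below the minimum.
    print(desired_capacity(current=4, metric=15, target=60, minimum=2, maximum=10))
```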
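Auto scaling groups

[Diagram: the user reaches the application through Amazon Route 53 and Amazon CloudFront, with Amazon S3 alongside; an ELB spreads traffic across web instances in an Auto-Scaling group spanning two Availability Zones; the tier also uses ElastiCache; the RDS DB instance is Active (Multi-AZ) in one zone with a Standby (Multi-AZ) in the other.]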
Backup and DR

• Define Objectives
• Backup Strategy
• Periodic Recovery Testing
• Automated Recovery
• Periodic Reviews
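
Objectives are commonly expressed as a recovery point objective (RPO, how much data loss is tolerable) and a recovery time objective (RTO, how long a restore may take). Assuming those two objectives, a periodic review can be reduced to two comparisons; the values in the demo are invented.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, now, rpo):
    """True when the newest backup is recent enough to satisfy the RPO."""
    return now - last_backup <= rpo


def meets_rto(measured_restore_time, rto):
    """True when the last recovery test finished within the RTO."""
    return measured_restore_time <= rto


if __name__ == "__main__":
    now = datetime(2017, 1, 10, 12, 0)
    last_backup = datetime(2017, 1, 10, 3, 0)
    print(meets_rpo(last_backup, now, rpo=timedelta(hours=24)))
    print(meets_rto(timedelta(minutes=50), rto=timedelta(minutes=30)))
```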
Performance efficiency pillar
Efficient use of computing resources to meet requirements, and
maintaining that efficiency as demand changes and technologies evolve

• Democratize advanced technologies
• Go global in minutes
• Use the right architectures for your backend and databases
• Experiment more often
Right Sizing

• Reference Architecture
• Quick Start Reference Deployments
• Benchmarking
• Load Testing
• Cost / Budget
• Monitoring and Notification
Utilization vs Provisioned capacity

[Chart: utilization vs provisioned capacity for November; the two labeled values are 76% and 24%.]
Proximity and Caching

• Content Delivery Network (CDN)
• Database Caching
• Reduce Latency
• Pro-active Monitoring and Notification

Services: Amazon CloudFront, Amazon ElastiCache, RDS DB instance read replica
Amazon CloudFront (CDN)

• Cache content at the edge for faster delivery
• Lower load on origin
• Dynamic and static content
• Streaming video
• Custom SSL certificates
• Low TTLs
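
How a TTL lowers load on the origin can be shown with a small in-process cache: a request is served from the cache until its entry expires, and only then goes back to the origin. This is an illustrative sketch, not how CloudFront is implemented; the class and parameter names are assumptions.

```python
import time


class TTLCache:
    """Serve cached values until they are older than ttl_seconds."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: the origin is not contacted
        value = self._fetch(key)  # miss or expired: go back to the origin
        self._store[key] = (value, now + self._ttl)
        return value


if __name__ == "__main__":
    origin_hits = []

    def origin(path):
        origin_hits.append(path)
        return f"content of {path}"

    cache = TTLCache(origin, ttl_seconds=5)
    cache.get("/index.html")
    cache.get("/index.html")
    print("origin requests:", len(origin_hits))
```

A low TTL trades a little extra origin traffic for fresher content, which is what makes caching dynamic content practical.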
Asynchronous patterns
Message passing

[Diagram: two patterns, Listener and Pub-Sub; in each, A and B communicate through a queue.]

SNS, SQS, Redis, RabbitMQ
Async. Architecture (part 1)

[Diagram: API instances (web instances), a queue, worker instances and ElastiCache, with these messages:]

API request: {DO foo}
API response: {JobID: 0001}
PUT JOB: {JobID: 0001, Task: DO foo}
GET JOB: {JobID: 0001, Task: DO foo}
Result: {JobID: 0001, Result: bar}
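
The job flow on this slide can be sketched in a single process: the API side enqueues a job and immediately returns a job ID, a worker picks the job up, and the result is stored where the client can fetch it later. In this sketch a standard-library queue stands in for the message queue and a dictionary stands in for ElastiCache; all names are assumptions.

```python
import queue
import threading
import uuid

jobs = queue.Queue()  # stands in for the message queue
results = {}  # stands in for ElastiCache
results_lock = threading.Lock()


def submit(task):
    """API side: enqueue the job and return its ID without waiting."""
    job_id = uuid.uuid4().hex
    jobs.put({"JobID": job_id, "Task": task})
    return {"JobID": job_id}


def worker():
    """Worker side: process jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        outcome = f"done: {job['Task']}"  # placeholder for the real work
        with results_lock:
            results[job["JobID"]] = outcome
        jobs.task_done()


def get_result(job_id):
    """API side: return the stored result, or None if not ready yet."""
    with results_lock:
        return results.get(job_id)


if __name__ == "__main__":
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    ticket = submit("DO foo")
    jobs.join()  # the demo waits; a real client would poll or be notified
    print(get_result(ticket["JobID"]))
    jobs.put(None)
    thread.join(timeout=2)
```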
Async. Architecture (part 2)

[Diagram: API instances, queue, worker instances and ElastiCache as in part 1; a push notification is sent to the user through Amazon SNS.]
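Full Decoupling

[Diagram: the user reaches the application through Amazon Route 53 and Amazon CloudFront; an Elastic Load Balancer fronts the web instances; worker instances are connected through a queue and Amazon SNS; supporting services are Amazon S3, ElastiCache and an RDS DB instance, Active (Multi-AZ), within an Availability Zone.]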
Event-driven patterns
Event driven

Event on B by A triggers C

[Diagram: A, B, C]
How Lambda works

Lambda functions are invoked in response to events:
- Changes in data
- Changes in state

Event sources, such as: S3 event notifications, DynamoDB Streams, Kinesis events, SNS events, CloudTrail events, Cognito events, Custom events, CloudWatch events

Lambda functions can access any service, including your own, such as: SNS, DynamoDB, Lambda, Redshift, Kinesis, S3, any custom
Event-driven using Lambda

[Diagram: users upload photos to S3: Source Bucket; AWS Lambda: Resize Images is triggered on PUTs and writes to S3: Destination Bucket.]
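
A handler for this flow starts by reading the bucket and object key from the S3 event notification. The sketch below covers only that part and leaves the download, resize and upload steps as a comment; the destination bucket name and the "resized/" key prefix are assumptions. Object keys arrive URL-encoded in S3 events, hence the decoding step.

```python
from urllib.parse import unquote_plus

DESTINATION_BUCKET = "example-resized-photos"  # hypothetical bucket name


def resize_targets(event):
    """Map each uploaded object in the event to a destination location."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys are URL-encoded
        targets.append(
            {
                "source_bucket": bucket,
                "source_key": key,
                "dest_bucket": DESTINATION_BUCKET,
                "dest_key": f"resized/{key}",
            }
        )
    return targets


def handler(event, context):
    """Lambda entry point: the actual resize and upload are omitted here."""
    targets = resize_targets(event)
    # For each target: download the source object, resize it, upload the result.
    return targets
```

Writing to a separate destination bucket matters: writing resized images back to the source bucket would trigger the function again on its own output.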
Databases
Read / Write Sharding

[Diagram: three app instances; one RDS DB instance as Master (Multi-AZ) and three RDS DB instances as Read Replicas.]
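
In the application, this split comes down to a routing decision: writes go to the master, reads are spread over the replicas. A minimal sketch follows, assuming that anything starting with SELECT is a read and using round-robin across replicas; the endpoint names are placeholders.

```python
import itertools


class ReadWriteRouter:
    """Send reads to replicas in turn and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        words = sql.lstrip().split(None, 1)
        is_read = bool(words) and words[0].upper() == "SELECT"
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.master


if __name__ == "__main__":
    router = ReadWriteRouter("master", ["replica-1", "replica-2", "replica-3"])
    print(router.endpoint_for("SELECT * FROM users"))
    print(router.endpoint_for("UPDATE users SET name = 'x' WHERE id = 1"))
```

Read replicas are updated asynchronously, so a read that must see a just-committed write should still go to the master.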
Database Federation

[Diagram: three app instances; separate databases per function: Products DB and Users DB.]
Database Sharding

User     ShardID
002345   A
002346   B
002347   C
002348   B
002349   A

[Diagram: three app instances routing requests to shards A, B and C.]
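
The lookup table on this slide is a directory-based sharding scheme: the application looks up the user's shard, then talks to that shard's database. A minimal sketch using the slide's table follows; the endpoint hostnames are hypothetical.

```python
# User-to-shard directory, taken from the table on the slide.
SHARD_DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}

# Hypothetical database endpoints, one per shard.
SHARD_ENDPOINTS = {
    "A": "shard-a.example.internal",
    "B": "shard-b.example.internal",
    "C": "shard-c.example.internal",
}


def shard_for(user_id):
    """Return the shard ID assigned to a user."""
    try:
        return SHARD_DIRECTORY[user_id]
    except KeyError:
        raise LookupError(f"no shard assigned for user {user_id}") from None


def endpoint_for(user_id):
    """Return the database endpoint that holds a user's data."""
    return SHARD_ENDPOINTS[shard_for(user_id)]


if __name__ == "__main__":
    print(endpoint_for("002347"))
```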
Specialized Database

NoSQL, Graph DB
Database specialization example: Redis

In-memory data structure store, used as a database, cache and message broker.
Specialized in data structures such as
• strings
• hashes
• lists
• sets
• sorted sets with range queries
• bitmaps
• hyperloglogs
• geospatial indexes with radius queries
Cost optimization pillar
Assess your ability to avoid or eliminate unneeded costs or suboptimal
resources, and use those savings on differentiated benefits for your business

• Analyze and attribute expenditure
• Managed services to reduce TCO
• Adopt a consumption model
• Benefits from economies of scale
• Stop spending money on data center operations
Pricing Model

• On Demand
• Reserved
• Spot
• Dedicated
Auto Start/Shutdown of Instances

[Diagram: Amazon CloudWatch rules trigger AWS Lambda, which acts on AWS Resources (EC2 instances).]

• Sleep trigger. Rules: every day at 21h30
• Wakeup trigger. Rules: every day at 6h15
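
The two rules and the function they trigger can be sketched as follows. The schedule strings use the six-field cron syntax of CloudWatch Events rules, which is evaluated in UTC. The event shape (an "action" plus a list of "instance_ids", passed as constant input by each rule) is an assumption of this sketch, and the EC2 client is injectable so the logic can be exercised without AWS access.

```python
# Schedule expressions for the two rules (evaluated in UTC).
SCHEDULES = {
    "sleep": "cron(30 21 * * ? *)",  # every day at 21:30
    "wakeup": "cron(15 6 * * ? *)",  # every day at 06:15
}


def handler(event, context, ec2=None):
    """Stop or start the listed instances depending on the rule that fired."""
    action = event["action"]
    instance_ids = list(event["instance_ids"])
    if action not in SCHEDULES:
        raise ValueError(f"unknown action: {action}")
    if ec2 is None:
        import boto3  # available in the Lambda runtime; imported lazily here

        ec2 = boto3.client("ec2")
    if action == "sleep":
        ec2.stop_instances(InstanceIds=instance_ids)
    else:
        ec2.start_instances(InstanceIds=instance_ids)
    return {"action": action, "instance_ids": instance_ids}
```

If the times on the slide are meant as local time, the cron hours need shifting by the UTC offset.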
Managed Services

• Let AWS do the heavy lifting.
• Databases, caches and big data solutions.
• Application Level Services.

Services: Amazon RDS, Amazon DynamoDB, Amazon Redshift, Amazon ElastiCache, AWS Elastic Beanstalk, Amazon Elasticsearch Service
Manage Expenditure

• Tag Resources
• Track Project Lifecycle
• Profile Applications vs Cost
• Monitor Usage & Spend

Auto Tagging resources as they start

[Diagram: Events: RunInstances, via Amazon CloudWatch, trigger AWS Lambda, which tags the EC2 Instances.]

Tag:
Owner = userName
PrincipalId = aws:userid
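
The tagging function can be sketched as below. It assumes the rule delivers the CloudTrail record for RunInstances under the event's "detail" key, with the caller in userIdentity and the new instances in responseElements.instancesSet.items; userName is only present for some identity types, so the Owner tag is added only when it exists. The EC2 client is injectable so the logic can be exercised without AWS access.

```python
def tags_for(detail):
    """Build the Owner and PrincipalId tags from the caller's identity."""
    identity = detail.get("userIdentity", {})
    tags = []
    if "userName" in identity:
        tags.append({"Key": "Owner", "Value": identity["userName"]})
    tags.append({"Key": "PrincipalId", "Value": identity.get("principalId", "unknown")})
    return tags


def launched_instance_ids(detail):
    """Extract the IDs of the instances created by the RunInstances call."""
    response = detail.get("responseElements") or {}
    items = response.get("instancesSet", {}).get("items", [])
    return [item["instanceId"] for item in items]


def handler(event, context, ec2=None):
    """Tag every instance started by the call that produced this event."""
    detail = event["detail"]
    ids = launched_instance_ids(detail)
    if not ids:
        return []
    if ec2 is None:
        import boto3  # available in the Lambda runtime; imported lazily here

        ec2 = boto3.client("ec2")
    ec2.create_tags(Resources=ids, Tags=tags_for(detail))
    return ids
```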
Operational excellence pillar
Operational practices and procedures used to manage production workloads

• Perform operations with code
• Align operations processes to business objectives
• Make regular, small, incremental changes
• Test for responses to unexpected events
• Learn from operational events and failures
• Keep operations procedures current
Infrastructure-as-code workflow

[Diagram: code, version control, code review, integrate]

“It’s all software”

AWS CloudFormation
• Create templates of your infrastructure.
• Version control/replicate/update templates like code.
• Integrates with development, CI/CD, management tools.
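
To make "templates like code" concrete, here is a minimal sketch that builds a CloudFormation template as a Python dictionary and renders it as JSON, ready to be committed and reviewed like any other source file. The single S3 bucket and its logical ID are assumptions chosen to keep the example small.

```python
import json


def bucket_template(logical_id="ExampleBucket"):
    """Smallest useful template: one S3 bucket with default settings."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Example stack with a single S3 bucket",
        "Resources": {
            logical_id: {"Type": "AWS::S3::Bucket"},
        },
    }


def render(template):
    """Serialize with stable key order so diffs in version control stay small."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render(bucket_template()))
```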
Some tips … from my own experience

• Architecture as code – code everything.
• Automate everything: “Invest time to save time”
• Don’t reinvent the wheel; managed services are your best friends.
• Embrace security early on.
• Test your DR strategy regularly.
• Serverless architectures free you from managing infrastructure.
• Did I mention automation?
The “Must” from Day 1
Operational Excellence

• High quality code
• Version controlled
• CI/CD pipeline
• Infrastructure as code
• Security at every layer
• Cost conscious
• Test & Monitor everything
• DR procedure
And don’t forget …
Trusted Advisor
Resources
https://aws.amazon.com/well-architected/
Questions?

Twitter: @adhorn
Email: adhorn@amazon.com
