5 Challenges To Achieving Observability at Scale


5 challenges to achieving observability at scale
Using automation and intelligence to overcome obstacles.
Introduction

Successful digital transformation requires every application
and digital service, and the dynamic multi-cloud platforms
they run on, to work perfectly. All the time.

But these dynamic, highly distributed cloud-native
technologies are fundamentally different from their
predecessors. The resulting complexity brought on by
microservices, containers, and software-defined cloud
infrastructure is overwhelming at web scale. It’s all
beyond the limits of human teams to manage and scale
on their own.

To understand everything going on in these ever-changing
environments, all of the time, observability needs to scale.

What teams are up against

Challenge One: The complexity of dynamic multi-cloud environments
Challenge Two: Monitoring dynamic microservices and containers in real time
Challenge Three: The volume, velocity, and variety of data and alerts
Challenge Four: Siloed Infra, Dev, Ops, Apps, and Biz teams
Challenge Five: Knowing which efforts drive positive business impact

5 Challenges to Achieving Observability at Scale ©2020 Dynatrace 2


More tools aren’t the answer

Some teams mistakenly try to solve this ‘observability at scale’
problem by adopting more siloed monitoring tools. The result is
more time spent on manual configuration, more technical debt,
and a continued struggle to identify issues and prioritize the
efforts with the greatest impact.

95% of applications in enterprise organizations are not monitored
due to siloed tools and burdensome manual effort.

As cloud complexity continues to grow, this approach becomes increasingly
unsustainable for even the most experienced teams, who are
continuously bogged down in manual-intensive tasks that reduce
their effectiveness at achieving what matters most.



The shift to intelligent observability

To scale observability, enterprise organizations must fundamentally
transform the way they work to innovate faster, keep up with constantly
changing tech stacks, and reduce risk across teams.

This scale happens when teams shift from simply observing and reacting to issues
as they arise, to a culture of proactive understanding and optimization. This
unlocks the ability to anticipate, predict and even auto-remediate problems that
matter most to the business.

[Figure: UX, logs, traces, metrics]

In deciding how to accelerate digital transformation, companies need to
understand that every decision is an investment in achieving the original goal
of observability: to proactively and efficiently improve user experiences
that drive business growth.



Automation and intelligence are essential
Whether selecting a DIY approach, buying another cheap tool, or investing in a strategic platform, everything costs time, money, people,
and quality. Prioritizing value and speed of delivery to the business and customers is paramount to finding success in this dynamic multi-cloud world.

Automation and intelligence are essential to transform how teams work to quickly and efficiently achieve observability at enterprise scale.

Requirements              Results

Complete coverage         More productivity and time to innovate
Automation everywhere     Higher quality releases
Real-time feedback        Better customer experiences
Precise answers           Reduced risk
Cross-collaboration       Accelerated business outcomes



Challenge 1:
The complexity of dynamic multi-cloud environments

[Figure: hybrid multi-cloud, serverless, microservices + containers, IoT]

The rate at which new technologies are available and implemented is increasing,
exploding the complexity that results from unmanageable volumes and speed
of data emitted by dynamic environments.

This makes it near impossible for IT teams to manually understand how
everything is related in context, all of the time. So, teams must find ways
to automate the understanding of this data and context to accelerate
digital transformation.



Teams often fail at digital transformation because they’re:

Hindered by disconnected data silos that prevent understanding
of entity relationships and interdependencies

Lacking understanding and context of upstream and downstream
system impacts from potential changes

Forced to prioritize manual instrumentation and mundane tasks
over developing new features

44% of an IT team’s time is spent on manual tasks, on average.
(Source: Dynatrace 2020 Global CIO Report)

These shortcomings introduce unnecessary risk and burden developers with repetitive toil,
ultimately hurting digital transformation efforts and the ability to drive innovation forward.



How to overcome it

Automation is an absolute necessity to not only handle the scale of every single
component in an enterprise ecosystem, but also understand all the interdependencies.

You can’t hire your way to observability at scale. Understanding dynamic multicloud
environments requires an automated approach that can multiply productivity of your
existing team and shift effort from manual tasks to driving tangible business results.

[Figure: applications, services, processes, hosts, data centers]



To scale observability and eliminate blind spots across increasingly complex
and expanding environments, teams need automation powered by:

Topology mapping that continuously maps components, cloud services,
and ever-changing relationships between potentially billions
of interdependencies

Auto-discovery of new components to prevent gaps in coverage
in real time

No-code approach to better leverage skilled developers on proactive
optimization efforts and business-driving innovation projects

This continuous automation and always-on context gives teams confidence in keeping up
with dynamic technology stacks to digitally transform faster, without the ongoing burden
of constant deployments and manual maintenance in an attempt to slowly gain more coverage and understanding.
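
To make the idea of topology mapping and auto-discovery more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular product works: the `TopologyMap` class, its method names, and the component names are all hypothetical. It shows a dependency graph that registers components the first time they are observed and can answer which components are affected when one of them changes.

```python
from collections import defaultdict, deque


class TopologyMap:
    """Toy dependency graph (hypothetical): caller -> callee means caller depends on callee."""

    def __init__(self):
        self.depends_on = defaultdict(set)   # component -> components it calls
        self.depended_by = defaultdict(set)  # component -> components that call it
        self.components = set()

    def observe_call(self, caller, callee):
        """Record an observed interaction; unseen components are registered on the fly."""
        newly_discovered = {caller, callee} - self.components
        self.components.update(newly_discovered)
        self.depends_on[caller].add(callee)
        self.depended_by[callee].add(caller)
        return newly_discovered

    def impacted_by(self, component):
        """Return every component that directly or transitively depends on `component`."""
        seen = set()
        queue = deque([component])
        while queue:
            current = queue.popleft()
            for dependant in self.depended_by[current]:
                if dependant not in seen:
                    seen.add(dependant)
                    queue.append(dependant)
        seen.discard(component)  # a dependency cycle should not report the component itself
        return seen


# Illustrative component names (assumptions, not from the source).
topology = TopologyMap()
topology.observe_call("web-frontend", "checkout-service")
topology.observe_call("checkout-service", "payments-db")
new = topology.observe_call("mobile-api", "checkout-service")

print(new)                                          # {'mobile-api'}
print(sorted(topology.impacted_by("payments-db")))
# ['checkout-service', 'mobile-api', 'web-frontend']
```

In this sketch the graph is built from observed traffic, so nobody has to declare a new component by hand. A real environment would also need to expire components that disappear, which is left out here.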



Challenge 2:
Monitoring dynamic microservices and containers in real time

Short-lived containers and microservices, like those managed in Kubernetes,
provide the required speed and agility to successfully modernize. However,
the dynamic nature of technologies that can spin up and down within seconds
introduces several major issues to scaling observability for these technologies.

This all results in a lack of understanding of internal states of the application,
other interdependent components that microservices rely on, and even the
impact on users.



IT teams are still blind to what’s happening in their dynamic environments
and are acting on incomplete data because they:

Don’t understand the relationships between containers
and upstream components that can impact them

Can’t connect end-to-end tracing from real users accessing these
microservices, to the nodes, the services and containers they run on

Lack real-time visibility into exactly what’s inside
the workloads running within containers

70% of CIOs say monitoring containerized microservices in real time
is almost impossible.
(Source: Dynatrace 2020 Global CIO Report)



How to overcome it
Enterprises need observability to scale across their multicloud, including cloud,
legacy, and hybrid environments, to handle the dynamic nature of Kubernetes
and containers.



To ensure everything’s accounted for, no matter how short lived,
teams need real-time intelligence and automation with:

Automatic discovery of containers at start-up, along with all things
running inside each workload

Topology context external to containers, since anomalies often occur
outside of Kubernetes nodes, pods, containers, and clusters

Full-stack visibility all the way from the pod, through the cloud provider
and application, to the user to understand the end-to-end business impact

With this speed, automation, and context applied to containers and microservices,
IT teams can continuously understand system behavior, and the true origin of anomalies
can be isolated and precisely pinpointed at scale.



Challenge 3:
The volume, velocity, and variety of data and alerts

Dynamic multi-cloud environments are exponentially increasing the amount
of telemetry data emitted, and overwhelmed teams are still stuck trying
to monitor every data point and make sense of it all.

Already constrained IT resources are stuck reacting to each new problem
after users and business goals are already impacted, trying to observe what’s
happening by manually building, maintaining, and constantly watching
potentially thousands of dashboards.



However, this approach doesn’t scale, and it perpetuates challenges
that cannot be solved with the same manual-intensive philosophy:

Defining and redefining “normal” for anomaly thresholds that constantly
change with dynamic environments and seasonality

Monitoring “unknown unknowns”: issues you aren’t aware of,
don’t understand, and don’t monitor

Siloed data sending mixed signals that multiply alert storms,
intensifying team fatigue and unnecessary war rooms

Multiple teams struggling to pinpoint issues across different tools
to guess the root cause, causing more finger-pointing and blaming

All of this forces teams to spend even more of their time “keeping the lights on”
by guessing about the problem, priority, and diagnosis, rather than continuously optimizing
and resolving issues before users are impacted.



How to overcome it

It’s clear that AI is needed to continuously and instantly understand when
and why anomalies occur. But the only way to transform from reactive
to proactive, is having an AI that doesn’t need to learn or be trained.

Because dynamic multi-cloud environments can change within seconds, AI needs
to know precise answers and be able to anticipate and auto-remediate issues
before business impact.

[Figure: business impact, root cause, deployment history, problem evolution]



A few critical capabilities of AI
that enable observability at scale:

Auto-adaptive threshold baselining for anomaly detection to prioritize
what really matters

Intelligent grouping of related anomalies into a single problem
to eliminate redundant work across teams

Always-on causation-based AI with code-level analysis that processes
billions of dependencies with complete fault tree analysis to instantly
deliver answers

Integrating answers with context from external systems (like ServiceNow
and other ITSMs) to broaden workflow automation across multiple teams

The goal of causation-based AI is to provide answers to engineering, infrastructure, operations, and application teams
and empower them to focus on the things that matter. Delivering one precise answer for each issue that
everyone understands can transform teams away from finger-pointing to efficient cross-team
collaboration that drives business outcomes.
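
As a rough illustration of what an adaptive threshold means in practice, the sketch below uses a generic rolling mean and standard deviation in Python. This is a common textbook technique chosen for illustration and is not the baselining algorithm of any specific product. The `AdaptiveBaseline` class, its parameters (`window`, `tolerance`, `min_samples`), and the sample values are all assumptions.

```python
from collections import deque
from statistics import mean, stdev


class AdaptiveBaseline:
    """Hypothetical rolling baseline: the threshold follows recent data instead of being fixed."""

    def __init__(self, window=30, tolerance=3.0, min_samples=5):
        self.history = deque(maxlen=window)  # only the most recent `window` values are kept
        self.tolerance = tolerance           # allowed deviation, in standard deviations
        self.min_samples = min_samples       # no verdict until this many values were seen

    def is_anomaly(self, value):
        anomalous = False
        if len(self.history) >= self.min_samples:
            center = mean(self.history)
            spread = stdev(self.history)
            # Guard against a zero spread when all recent values are identical.
            anomalous = abs(value - center) > self.tolerance * max(spread, 1e-9)
        # Every value joins the window, so the baseline keeps adapting.
        self.history.append(value)
        return anomalous


# Illustrative response times in milliseconds (made-up values).
detector = AdaptiveBaseline(window=30, tolerance=3.0, min_samples=5)
samples = [100, 102, 98, 101, 99, 100, 250, 101]
flags = [detector.is_anomaly(v) for v in samples]
print(flags)  # [False, False, False, False, False, False, True, False]
```

Because every value is added to the window, a sustained change in level eventually becomes the new "normal" without anyone editing a threshold. The trade-off in this simple version is that a single large spike temporarily widens the tolerated range.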



Challenge 4:
Siloed Infra, Dev, Ops, Apps, and Biz teams

[Figure: APM, AIOps, Legacy, Log, DEM, Network, Infrastructure]

New cloud-native technologies require more solutions to instrument
and monitor, but teams are already drowning in tool sprawl. This tool
sprawl aggravates silos that hurt innovation, decrease software quality,
and reduce collaboration.



Each different tool and point solution amplifies these silos,
with the negative effects spreading across each team that struggles
to identify and resolve the issues and optimizations with the highest impact.

Data
Lack of connective tissue inflicts time-consuming and error-prone
joining of disparate data models

Environments
Isolated observability and monitoring across pre-prod and production
environments hurt speed and quality of ‘shift-left’ efforts for
DevOps and SRE teams

Platforms
Multiple tools for multi or hybrid cloud platforms create observability
blind spots for infrastructure and platform operators

Teams
When each team receives alerts and symptoms in a vacuum, problems
and blame are passed “over the wall” to others



How to overcome it
To eliminate these silos, a solution can’t simply stitch it all together. It has to
bring together teams through a single common language. Bridging these gaps
with a single source of truth removes confusion and multiplies productivity
across teams.

This cross-team collaboration and more efficient working environment boost
the speed of value-add product features and optimizations that drive better
user experiences.



Several key requirements enable teams to collaborate more efficiently
towards the same technical and business SLIs/SLOs:

Single data model to scale observability across all layers and
components across the full tech stack

Shared context that facilitates cross-team collaboration, with
flexibility to slice and dice across infrastructure, applications,
operations, and business data

Seamlessly tying together the entire software lifecycle from feature
development, testing, releases, and ongoing optimizations to innovate
faster with higher quality
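
For readers less familiar with SLIs and SLOs, the following sketch shows one common way to evaluate them. It is an assumption-laden illustration and is not taken from the source: the `evaluate_slo` function, the `SloReport` fields, and the request counts are hypothetical. The SLI is computed as the fraction of "good" events, and the error budget is the share of "bad" events that the SLO target still permits.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # observed fraction of good events
    target: float            # SLO target, e.g. 0.999
    budget_remaining: float  # fraction of the error budget left (negative if overspent)
    met: bool                # True when the SLI is at or above the target


def evaluate_slo(good_events, total_events, target):
    """Compare an event-based SLI against an SLO target (hypothetical helper)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    sli = good_events / total_events
    allowed_bad = (1 - target) * total_events  # bad events the target tolerates
    actual_bad = total_events - good_events
    budget_remaining = 1 - actual_bad / allowed_bad
    return SloReport(sli, target, budget_remaining, sli >= target)


# Made-up numbers: 100,000 requests, 50 of them failed, target of 99.9% success.
report = evaluate_slo(good_events=99_950, total_events=100_000, target=0.999)
print(report.met)                         # True
print(round(report.budget_remaining, 2))  # 0.5
```

The same calculation works for a business-facing indicator, such as the share of checkout attempts that complete, which is one way technical and business teams can look at the same kind of number.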



Challenge 5:
Knowing which efforts drive positive business impact

Even with complete visibility into back-end components, a lack of front-end
user perspective diminishes much of the tangible value that organizations
aim to achieve with observability efforts.



Without visibility into front-end application performance,
teams face major risks to the user experience:

Disconnected front-end and back-end perspectives, hurting understanding
of technology’s impact on users and business objectives

Critical blind spots like mobile app crashes, 3rd party services, CDN,
and front-end errors still exist

Disparate solutions to attempt observability for mobile and edge-device
channels, forcing teams to leave some applications ignored

No consideration of employees working from home, potentially damaging
their ability to access the resources they need to deliver frictionless
customer experiences

Neglecting the end-user experience of applications obstructs the ability
to prioritize optimizations and issues based on greatest business impact.
When teams only look at technology by itself, IT efforts may not align with business priorities.



How to overcome it
An outside-in user perspective of the application is needed to create
a feedback loop from back-end technology teams to product, digital,
and business teams, ensuring the entire cloud stack is supporting
expected outcomes.



To incorporate user experience into a more intelligent observability approach,
organizations need to connect front-end and back-end perspectives to gain:

Complete insight of technology’s impact on user experience and business
KPIs like revenue, conversions, and feature adoption

Observability and monitoring across web, mobile, and IoT to gain
an understanding of the holistic user experience across channels

All-in-one platform to optimize end-user experience for both customers
and employees, no matter where they are in the world

To achieve observability that scales across channels, customers, employees, and all types of applications,
back-end and front-end application performance must be connected. Only then can teams across IT, product,
and business prioritize and align efforts that drive the bottom line.



Conclusion

To achieve observability at scale for dynamic multi-cloud environments
at the speed needed to exceed customer expectations and business goals,
a fundamentally different approach is required.

Continuing to waste effort on manual instrumentation and configuration,
digging through siloed data, and working on the wrong things prevents teams
from making progress, and ultimately from achieving strategic business goals.

Automated and intelligent observability is needed.

Dynatrace helps transform the way you work with:

Intelligent observability — See it all down to code-level, at scale

Continuous automation — Stay ahead of modern, dynamic multi-clouds

Precise Intelligence — Go from guessing to knowing



Our smarter approach to observability
helps teams turn AI into ROI, and drive:

99% fewer IT tickets
From 700 tickets a week to just 7.

20% higher cart value
Order-from-table mobile application drives higher value than order from bar.

75% faster innovation delivered
With 75% MTTR and 4x productivity increase.




Software intelligence for the enterprise cloud

Click the link to take the next step in your digital journey
and see what we can do for you.

If you’re ready to learn more, please visit dynatrace.com/platform for assets, resources, and a free 15-day trial.

About Dynatrace
Dynatrace provides software intelligence to simplify cloud complexity and accelerate digital transformation. With automatic and intelligent observability at scale, our all-in-one platform delivers precise answers about the performance of applications, the underlying
infrastructure and the experience of all users to enable organizations to innovate faster, collaborate more efficiently, and deliver more value with dramatically less effort. That’s why many of the world’s largest enterprises trust Dynatrace® to modernize and
automate cloud operations, release better software faster, and deliver unrivaled digital experiences.

dynatrace.com blog @dynatrace

12.17.20 10603_EBK_cs ©2020 Dynatrace
