Algorithms Pseudocode

The document discusses performance debugging for distributed systems composed of black box components, focusing on isolating performance bottlenecks caused by complex interactions. It presents two algorithms: the nesting algorithm, which analyzes RPC-style communications, and the convolution algorithm, applicable to broader message-based systems. Experimental results demonstrate the effectiveness of these algorithms in identifying high-impact causal paths and visualizing performance issues.

Uploaded by

HARSH VITHLANI


Performance Debugging for Distributed Systems of Black Boxes
Haiyong Xie, Yinghua Wu
Outline
 Problem statement & goals
 Overview of our approach

 Algorithms
 The nesting algorithm (RPC)
 The convolution algorithm (RPC or free-form)

 Experimental results
 Visualization GUI

 Related work

 Conclusions
Motivation
 Complex distributed systems
 Built from black box components

 Heavy communications traffic

 Bottlenecks at some specific nodes

 These systems may have performance problems

 High or erratic latency

 Caused by complex system interactions

 Isolating performance bottlenecks is hard

 We cannot always examine or modify system components
 We need tools to infer where bottlenecks are

 Choose which black boxes to open

Example multi-tier system
[Figure: multi-tier system with client, web server, authentication server, application server, and database server nodes; one element is labeled 100ms]
Goals
 Isolating performance bottlenecks
 Find high-impact causal path patterns
 Causal path: series of nodes that sent/received messages; each message is caused by receipt of the previous message
 Some causal paths occur many times
 High-impact: occurs frequently and contributes significantly to overall latency
 Identify high-latency nodes on high-impact patterns
 Add significant latency to these patterns

Then what should we do?

A message trace is enough
The Black Box

[Figure: a complex distributed system built from “black boxes”, and its performance bottlenecks]
 Desired properties
 Zero-knowledge, zero-instrumentation, zero-perturbation
 Scalability
 Accuracy
 Efficiency (time and space)
Overview of Approach
 Obtain traces of messages between components
 Ethernet packets, middleware messages, etc.
 Collect traces as non-invasively as possible
 Require very little information:
[timestamp, source, destination, call/return, call-id]

 Analyze traces using our algorithms

 Nesting: faster, more accurate, limited to RPC-style systems
 Convolution: works for all message-based systems

 Visualize results and highlight high-impact paths
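
The five-field trace record listed above can be modeled directly. The following is a minimal sketch, assuming one whitespace-separated text line per message; the `TraceEntry` name, the field names, and the line layout are illustrative assumptions, not a format defined in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """One message in the trace (field names are assumptions)."""
    timestamp: float   # time the message was observed, in seconds
    source: str        # sending node
    destination: str   # receiving node
    kind: str          # "CALL" or "RETURN"
    call_id: str       # identifier shared by a call and its return


def parse_line(line: str) -> TraceEntry:
    """Parse a hypothetical 'timestamp source destination kind call_id' line."""
    ts, src, dst, kind, call_id = line.split()
    if kind not in ("CALL", "RETURN"):
        raise ValueError(f"unknown message kind: {kind!r}")
    return TraceEntry(float(ts), src, dst, kind, call_id)


def load_trace(lines):
    """Parse all non-empty lines and order the entries by timestamp."""
    entries = [parse_line(line) for line in lines if line.strip()]
    return sorted(entries, key=lambda e: e.timestamp)


# Made-up sample lines, deliberately out of order.
sample = [
    "0.120 B C CALL 8",
    "0.100 A B CALL 7",
    "0.150 C B RETURN 8",
    "0.200 B A RETURN 7",
]
trace = load_trace(sample)
```

Sorting by timestamp is a convenience here; it presumes the timestamps come from clocks that are close enough for the ordering to be meaningful.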

Recap: causal path
[Figure: the example multi-tier system (client, web server, authentication server, application server, and database server nodes; 100ms label)]
Challenges
 Traces contain interleaved messages from many causal paths
 How to identify causal paths?
 Trace causality by timestamp
 Want only statistically significant causal paths
 How to differentiate significance?
 It is easy: they appear repeatedly
The nesting algorithm
 Depends on RPC-style communication
 Infers causality from “nesting” relationships by message timestamps
 Suppose A calls B and B calls C before returning to A
 Then the B→C call is “nested” in the A→B call
 Uses statistical correlation

[Figure: timeline for nodes A, B, and C: A calls B, B calls C, C returns to B, B returns to A]
Nesting: an example causal path
Consider this system of 4 nodes
Looking for internal delays at each node

[Figure: timeline for nodes A, B, C, and D: A calls B; B calls C and D, and both return, before B returns to A]
Steps of the nesting algorithm
1. Pair call and return messages
 (A→B, B→A), (B→D, D→B), (B→C, C→B)
2. Find and score all nesting relationships
 B→C nested in A→B
 B→D also nested in A→B
3. Pick best parents
 Here: unambiguous
4. Reconstruct call paths
 AB[C ; D]

O(m) run time
m = number of messages
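
The four steps above can be sketched for the four-node example as follows. This is a simplified illustration, not the authors' implementation: the timestamps are invented, step 3 picks the innermost enclosing call instead of scoring candidates, the parent search is a quadratic scan rather than O(m), and the output string format is this sketch's own.

```python
def pair_calls(trace):
    """Step 1: match every RETURN to its CALL by call id."""
    open_calls, pairs = {}, []
    for t, src, dst, kind, call_id in sorted(trace):
        if kind == "CALL":
            open_calls[call_id] = (t, src, dst)
        elif call_id in open_calls:
            start, caller, callee = open_calls.pop(call_id)
            pairs.append({"caller": caller, "callee": callee,
                          "start": start, "end": t, "children": []})
    return pairs


def assign_parents(pairs):
    """Steps 2-3: a child is nested in a call into its caller that encloses it.

    Simplification: choose the enclosing call that started last.
    """
    roots = []
    for child in pairs:
        candidates = [p for p in pairs
                      if p["callee"] == child["caller"]
                      and p["start"] < child["start"]
                      and child["end"] < p["end"]]
        if candidates:
            parent = max(candidates, key=lambda p: p["start"])
            parent["children"].append(child)
        else:
            roots.append(child)
    return roots


def render(pair):
    """Step 4: print a call and its nested calls, ordered by start time."""
    kids = sorted(pair["children"], key=lambda c: c["start"])
    inner = " ; ".join(render(k) for k in kids)
    return pair["callee"] + (f"[{inner}]" if kids else "")


def call_paths(trace):
    roots = assign_parents(pair_calls(trace))
    return [f'{r["caller"]}→{render(r)}' for r in roots]


# (timestamp, source, destination, call/return, call-id); values are made up.
example = [
    (0.00, "A", "B", "CALL", 1),
    (0.01, "B", "C", "CALL", 2),
    (0.03, "C", "B", "RETURN", 2),
    (0.04, "B", "D", "CALL", 3),
    (0.07, "D", "B", "RETURN", 3),
    (0.09, "B", "A", "RETURN", 1),
]
print(call_paths(example))  # ['A→B[C ; D]']
```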
Pseudo-code for the nesting algorithm
 Detect call pairs and find all possible nestings of one call pair in another
 Pick the most likely candidate for the causing call for each call pair
 Derive call paths from the causal relationships

procedure FindCallPairs
    for each trace entry (t1, CALL/RET, sender A, receiver B, callid id)
        case CALL:
            store (t1, CALL, A, B, id) in Topencalls
        case RETURN:
            find matching entry (t2, CALL, B, A, id) in Topencalls
            if match is found then
                remove entry from Topencalls
                update entry with return message timestamp t2
                add entry to Tcallpairs
                entry.parents := {all callpairs (t3, CALL, X, A, id2) in Topencalls with t3 < t2}

procedure ScoreNestings
    for each child (B, C, t2, t3) in Tcallpairs
        for each parent (A, B, t1, t4) in child.parents
            scoreboard[A, B, C, t2-t1] += (1/|child.parents|)

procedure FindNestedPairs
    for each child (B, C, t2, t3) in call pairs
        maxscore := 0
        for each p (A, B, t1, t4) in child.parents
            score[p] := scoreboard[A, B, C, t2-t1] * penalty
            if (score[p] > maxscore) then
                maxscore := score[p]
                parent := p
        parent.children := parent.children U {child}

procedure FindCallPaths
    initialize hash table Tpaths
    for each callpair (A, B, t1, t2)
        if callpair.parents = null then
            root := { CreatePathNode(callpair, t1) }
            if root is in Tpaths then update its latencies
            else add root to Tpaths

function CreatePathNode(callpair (A, B, t1, t4), tp)
    node := new node with name B
    node.latency := t4 - t1
    node.call_delay := t1 - tp
    for each child in callpair.children
        node.edges := node.edges U { CreatePathNode(child, t1) }
    return node
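
The scoring and parent-selection procedures can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the authors' code: the `CallPair` class, the 5 ms bucket width, and the made-up timestamps are all hypothetical, candidate parents are found by simple interval containment, and `PENALTY` is a constant placeholder because the slides do not define the penalty term.

```python
from collections import defaultdict
from dataclasses import dataclass, field

BUCKET_MS = 5    # histogram bucket width for call delays (assumption)
PENALTY = 1.0    # placeholder; the slides do not define "penalty"


@dataclass(eq=False)  # identity-based equality, so pairs can be dict keys
class CallPair:
    caller: str
    callee: str
    start: int   # call timestamp, ms
    end: int     # return timestamp, ms
    parents: list = field(default_factory=list)
    children: list = field(default_factory=list)


def find_candidates(pairs):
    """Candidate parents: calls into the child's caller that enclose the child."""
    for child in pairs:
        child.parents = [p for p in pairs
                         if p.callee == child.caller
                         and p.start < child.start and child.end < p.end]


def key(parent, child):
    delay = child.start - parent.start
    return (parent.caller, parent.callee, child.callee, delay // BUCKET_MS)


def score_nestings(pairs):
    """Each child spreads one vote evenly over its candidate parents."""
    scoreboard = defaultdict(float)
    for child in pairs:
        for parent in child.parents:
            scoreboard[key(parent, child)] += 1.0 / len(child.parents)
    return scoreboard


def find_nested_pairs(pairs, scoreboard):
    """Pick, for each child, the candidate whose delay bucket scored highest."""
    chosen = {}
    for child in pairs:
        best, max_score = None, 0.0
        for parent in child.parents:
            score = scoreboard[key(parent, child)] * PENALTY
            if score > max_score:
                best, max_score = parent, score
        if best is not None:
            best.children.append(child)
            chosen[child] = best
    return chosen


# Two unambiguous A->B->C paths, then two overlapping (parallel) ones.
u1p, u1c = CallPair("A", "B", 0, 30), CallPair("B", "C", 10, 20)
u2p, u2c = CallPair("A", "B", 100, 130), CallPair("B", "C", 110, 120)
p1, p2 = CallPair("A", "B", 200, 240), CallPair("A", "B", 205, 245)
c1, c2 = CallPair("B", "C", 210, 220), CallPair("B", "C", 215, 225)
pairs = [u1p, u1c, u2p, u2c, p1, p2, c1, c2]

find_candidates(pairs)
board = score_nestings(pairs)
chosen = find_nested_pairs(pairs, board)
```

In this example `c1` and `c2` each have two candidate parents, so local information cannot decide. The 10 ms delay bucket accumulates the highest aggregate score across the trace, so `c1` is assigned to `p1` and `c2` to `p2`.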
Inferring nesting
 An example of parallel calls
− Local info not enough
− Use aggregate info
− Histograms keep track of possible latencies
− Medium-length delay will be selected
− Assign nesting
− Heuristic methods

[Figure: timeline for nodes A, B, and C with message timestamps t1, t2, t3, t4]
Outline
 Problem statement & goals
 Overview of our approach

 Algorithms
 The nesting algorithm
 The convolution algorithm

 Experimental results
 Visualization GUI

 Related work

 Conclusions
The convolution algorithm
 “Time signal” of messages for each <source node, destination node>
 A sent messages to B at times 1, 2, 5, 6, 7

[Figure: S1(t) = A→B messages, plotted on a time axis from 1 to 7]
The convolution algorithm
 Look for time-shifted similarities
 Compute convolution X(t) = S2(t) ⊗ S1(t)
 Use Fast Fourier Transforms
 Peaks in X(t) suggest causality between A→B and B→C
 Time shift of a peak indicates delay

[Figure: S1(t) (A→B), S2(t) (B→C), and the resulting X(t)]
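
The search for time-shifted similarity can be illustrated with a direct cross-correlation. Note the substitution: the slides compute this with Fast Fourier Transforms, whereas this sketch uses plain summation, which is slower but easier to follow. The function names and the example signals are assumptions made for illustration.

```python
def cross_correlate(s1, s2, max_shift):
    """X[d] = sum over t of s1[t] * s2[t + d], for d = 0..max_shift."""
    n = min(len(s1), len(s2))
    return [sum(s1[t] * s2[t + d] for t in range(n - d))
            for d in range(max_shift + 1)]


def likely_delay(s1, s2, max_shift):
    """Return the shift at which the correlation peaks."""
    x = cross_correlate(s1, s2, max_shift)
    return max(range(len(x)), key=lambda d: x[d])


# A->B messages at times 1, 2, 5, 6, 7; B->C messages 3 steps later.
s1 = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0]
s2 = [0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]

print(cross_correlate(s1, s2, max_shift=5))  # [1, 1, 3, 5, 3, 1]
print(likely_delay(s1, s2, max_shift=5))     # 3
```

The peak at shift 3 matches the delay built into the example, which is the sense in which a peak in X(t) suggests that A→B messages cause B→C messages.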
Convolution details
 Time complexity: O(em + eV log V)
 m = number of messages
 e = output edges
 V = number of time steps in trace

 Need to choose time step size

 Must be shorter than delays of interest
 Too coarse: poor accuracy

 Too fine: long running time

 Robust to noise in trace

Algorithm comparison
 Nesting

 Looks at individual paths and then aggregates

 Finds rare paths

 Requires call/return style communication

 Fast enough for real-time analysis

 Convolution

 Applicable to a broader class of systems

 Slower: more work with less information

 May need to try different time steps to get good results
 Reasonable for off-line analysis
Summary

                           Nesting Algorithm                   Convolution Algorithm
Communication style        RPC only                            RPC or free-form messages
Rare events                Yes, but hard                       No
Level of trace detail      <timestamp, sender, receiver>       <timestamp, sender, receiver>
                           + call/return tag
Time and space complexity  Linear space, linear time           Linear space, polynomial time
Visualization              RPC call and return combined;       Less compact
                           more compact
Outline
 Problem statement & goals
 Overview of our approach

 Algorithms

 Experimental results

 Maketrace: a trace generator

 Maketrace web server simulation

 Pet Store EJB traces

 Execution costs

 Visualization GUI

 Related work

 Conclusions
Maketrace
 Synthetic trace generator
 Needed for testing
 Validate output for known input
 Check corner cases
 Uses set of causal path templates
 All call and return messages, with latencies
 Delays are x ± y seconds, Gaussian normal
distribution
 Recipe to combine paths
 Parallelism, start/stop times for each path
 Duration of trace
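
A generator in the spirit of Maketrace can be sketched as follows. This is not the Maketrace tool itself: the template layout, the node names, the delay values, and the `make_trace` function are all hypothetical. Each template step carries a Gaussian delay (mean, standard deviation) measured from the previous message, and several instances of the path are started at different times so that their messages interleave.

```python
import random

# One causal path template (values are made up):
# (call/return, source, destination, call label, mean delay s, std dev s)
TEMPLATE = [
    ("CALL",   "client", "web",    "c1", 0.000, 0.000),
    ("CALL",   "web",    "db",     "c2", 0.010, 0.002),
    ("RETURN", "db",     "web",    "c2", 0.050, 0.010),
    ("RETURN", "web",    "client", "c1", 0.005, 0.001),
]


def make_trace(template, start_times, seed=0):
    """Emit one path instance per start time, then merge by timestamp."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    trace = []
    for instance, t in enumerate(start_times):
        for kind, src, dst, label, mean, std in template:
            # Clamp at zero so a message never precedes its cause.
            t += max(0.0, rng.gauss(mean, std))
            call_id = f"{instance}-{label}"
            trace.append((t, src, dst, kind, call_id))
    trace.sort(key=lambda entry: entry[0])
    return trace


# Three overlapping instances, started 20 ms apart.
trace = make_trace(TEMPLATE, start_times=[0.0, 0.02, 0.04])
for entry in trace:
    print(entry)
```

Because the input template is known, the output of an analysis algorithm run on such a trace can be checked against it, which is the validation use described above.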
Desired results for one trace
 Causal paths
 How often
 How much time spent
 Nodes
 Host/component name
 Time spent in node and all of the nodes it calls
 Edges
 Time parent waits before calling child
Measuring Added Delay
 Added 200 msec delay in WS2
 The nesting algorithm detects the added delay, and so does the convolution algorithm
Results: Petstore
 Sample EJB application
 J2EE middleware for Java
 Instrumentation from Stanford’s PinPoint project
 50 msec delay added in mylist.jsp
Results: running time

Trace               Length (messages)  Duration (sec)  Memory (MB)  CPU time (sec)
Nesting
Multi-tier (short)             20,164              50          1.5            0.23
Multi-tier                    202,520             500         13.8            2.27
Multi-tier (long)           2,026,658           5,000        136.8           23.97
PetStore                      234,036           2,000         18.4            2.92
Convolution (20 ms time step)
PetStore                      234,036           2,000         25.0        6,301.00

More details and results in paper
Accuracy vs. parallelism
 Increased parallelism degrades accuracy slightly
 Parallelism is number of paths active at same time
[Figure: false positives (y-axis, 0 to 60) vs. parallelism per node (x-axis, 0 to 500)]
Other results for nesting algorithm
 Clock skew
 Little effect on accuracy with skew ≤ delays of interest
 Drop rate
 Little effect on accuracy with drop rates ≤ 5%
 Delay variance
 Robust to ≤ 30% variance
 Noise in the trace
 Only matters if same nodes send noise
 Little effect on accuracy with ≤ 15% noise
Visualization GUI
 Goal: highlight dominant paths
 Paths sorted
 By frequency
 By total time
 Red highlights
 High-cost nodes
 Timeline
 Nested calls
 Dominant subcalls
 Time plots
 Node time
 Call delay
Related work
 Systems that trace end-to-end causality via modified middleware using modified JVM or J2EE layers
 Magpie (Microsoft Research), aimed at performance debugging
 Pinpoint (Stanford/Berkeley), aimed at locating faults
 Products such as AppAssure, PerformaSure, OptiBench
 Systems that make inferences from traces
 Intrusion detection (Zhang & Paxson, LBL) uses traces + statistics to find compromised systems
Future work
 Automate trace gathering and conversion
 Sliding-window versions of algorithms
 Find phased behavior
 Reduce memory usage of nesting algorithm
 Improve speed of convolution algorithm
 Validate usefulness on more complicated systems
Conclusions
 Looking for bottlenecks in black box systems
 Finding causal paths is enough to find bottlenecks
 Algorithms to find paths in traces really work
 We find correct latency distributions
 Two very different algorithms get similar results
 Passively collected traces have sufficient information
