International Journal of Scientific Research in Science, Engineering and Technology
Print ISSN - 2395-1990
Online ISSN : 2394-4099
Available Online at : www.ijsrset.com
doi : https://doi.org/10.32628/IJSRSET12411594
Automation in Distributed Shared Memory Testing for MultiProcessor Systems
Swethasri Kavuri
Independent Researcher, USA
ARTICLE INFO
Article History:
Accepted: 20 May 2019
Published: 30 May 2019
Publication Issue:
Volume 6, Issue 3
May-June-2019
Page Number:
508-521
ABSTRACT
This research paper explores the critical domain of automated testing for Distributed Shared Memory (DSM) systems in multi-processor environments. As the complexity of multi-core and distributed computing systems continues to grow, ensuring the reliability and performance of DSM implementations becomes increasingly challenging. This study investigates various automated testing strategies, including test generation techniques, fault injection mechanisms, and concurrency detection methods. It also examines automated test execution frameworks, real-time monitoring solutions, and advanced verification and validation techniques. The research highlights the challenges faced in DSM testing, such as scalability issues and non-determinism, and proposes future directions for research, including the integration of artificial intelligence and cloud-based testing platforms. The findings of this study contribute to the advancement of DSM testing methodologies and provide valuable insights for both researchers and practitioners in the field of distributed systems and parallel computing.
Keywords: Distributed Shared Memory, Multi-Processor Systems,
Automated Testing, Fault Injection, Concurrency Detection, Formal
Verification, Performance Benchmarking, Parallel Computing
I. INTRODUCTION
1.1 Distributed Shared Memory (DSM) Systems Context
DSM systems are an important paradigm for parallel and distributed computing: they provide a single uniform memory abstraction over memory modules that are physically distributed. DSM systems attempt to combine the programming simplicity of shared memory models with the scalability and fault tolerance of a distributed system (Tanenbaum & van Steen, 2017). DSM first emerged in the mid-1980s. Since then, it has grown considerably more complex to meet the rising needs of high-performance computing and large-scale data processing.
Copyright © 2024 The Author(s) : This is an open access article under the CC BY license
(http://creativecommons.org/licenses/by/4.0/)
Swethasri Kavuri Int J Sci Res Sci Eng Technol, May-June-2019, 6 (3) : 508-521
1.2 Testing of Multi-Processor Systems
Testing multi-processor systems, and DSM in particular, poses challenges of its own. They include:
I. Concurrency issues: Race conditions, deadlocks, and livelocks are very difficult to identify and reproduce.
II. Non-determinism: Interleaving operations across multiple processors is very likely to lead to non-deterministic behaviour.
III. Scalability: Testing complexity grows exponentially, not linearly, as the number of processors increases.
IV. Memory consistency: The chosen consistency models have to be implemented and then maintained correctly throughout the system.
V. Performance variation: Variation in system performance caused by network latency, cache coherence protocols, and related factors must be taken into account.
This pie chart illustrates the distribution of challenges in DSM testing, highlighting the relative importance of each challenge based on the research findings.
1.3 Research Objectives and Scope
This research addresses the following objectives:
1. Critically analyse and evaluate approaches to test automation tailored specifically to DSM applications in a multi-processor environment.
2. Examine state-of-the-art test generation techniques and fault injection methodologies that enable thorough testing of DSMs.
3. Examine automated test execution frameworks and real-time monitoring frameworks for managing test activities effectively.
4. Examine verification and validation approaches, such as formal methods and runtime assertion checking, in DSM systems.
5. Discuss the challenges and limitations of DSM testing and potential future research avenues.
This research covers both software and hardware aspects of DSM testing, with a focus on automated approaches that can improve the reliability, performance, and scalability of multi-processor systems using DSM architectures.
II. THEORETICAL FRAMEWORK
2.1. Distributed Shared Memory Architecture
In a DSM architecture, every processor in a distributed system has access to a global address space. This abstraction lets processes on different nodes share data as if there were a single shared memory, even though the memory physically resides on several machines (Protic et al., 1996).
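The idea of a global address space can be sketched as a translation from a global address to the node that physically holds it. The following is a minimal illustration only: the page size, the round-robin placement policy, and the names `Location` and `translate` are assumptions made for this sketch and are not taken from any particular DSM system.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # bytes per page; an assumed value for illustration


@dataclass(frozen=True)
class Location:
    node: int    # node that physically hosts the page
    page: int    # page index local to that node
    offset: int  # byte offset inside the page


def translate(global_addr: int, num_nodes: int, page_size: int = PAGE_SIZE) -> Location:
    """Map a global address to (node, local page, offset).

    Pages are placed round-robin across nodes (an assumed policy).
    """
    if global_addr < 0:
        raise ValueError("address must be non-negative")
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    global_page, offset = divmod(global_addr, page_size)
    node = global_page % num_nodes         # which node owns this page
    local_page = global_page // num_nodes  # index of the page on that node
    return Location(node, local_page, offset)


if __name__ == "__main__":
    for addr in (0, 4096, 4 * 4096 + 10):
        print(addr, translate(addr, num_nodes=4))
```

A process only ever uses the global address; the DSM layer performs a translation of this kind to decide whether an access is local or requires communication with another node.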
DSM systems fall into two broad categories: hardware-based and software-based. Hardware-based DSM systems, such as the Stanford DASH multiprocessor, rely on hardware support to achieve coherence and consistency.
International Journal of Scientific Research in Science, Engineering and Technology | www.ijsrset.com
Software-based DSM systems, such as TreadMarks and Munin, implement the shared memory abstraction entirely in software, gaining flexibility at a potential cost in performance.
DSM System Implementation: Key Components
[1]. Memory Management: Allocates and deallocates shared memory regions.
[2]. Consistency Protocol: Governs memory operations according to the consistency model in use.
[3]. Communication Subsystem: Handles message passing between nodes to transfer data and to let nodes synchronize with one another.
[4]. Coherence Mechanism: Keeps shared data coherent across multiple caches.
Recent developments in DSM architectures include hybrid systems that integrate the shared memory and message passing paradigms. For instance, Grappa, the runtime system presented by Nelson et al. (2015), provides a DSM abstraction on commodity clusters with improved performance for irregular applications.
2.2. Consistency Models in DSM Systems
Memory consistency models define the rules that govern how memory operations are ordered and made visible in a DSM system. These models form a contract between the programmer and the system that states how memory operations will behave (Adve & Gharachorloo, 1996).
Some common consistency models are:
• Sequential Consistency (SC): Proposed by Lamport in 1979, SC ensures that the result of any execution is the same as if the operations of all processors were executed in some sequential order, with the operations of each individual processor appearing in that sequence in the order specified by its program.
• Release Consistency (RC): Gharachorloo et al. designed RC, which relaxes the constraints of SC by allowing memory operations to be reordered between synchronization points, thus improving performance.
• Lazy Release Consistency (LRC): LRC is an optimization of RC designed by Keleher et al. In LRC, the propagation of modifications is delayed until the next synchronization operation is encountered. As a result, it reduces communication overhead.
• Entry Consistency (EC): Bershad and Zekauskas (1991) proposed EC. It associates shared variables with synchronization objects, which allows fine-grained consistency management.
Table 2: Comparison of these consistency models by characteristics and performance impact:
Consistency Model | Ordering Constraints | Communication Overhead | Programming Complexity
Sequential Consistency | Strict global order | High | Low
Release Consistency | Relaxed between synchronization points | Medium | Medium
Lazy Release Consistency | Further relaxed, delayed propagation | Low | Medium
Entry Consistency | Fine-grained, data-centric | Very Low | High
The choice of consistency model significantly affects both the performance attained by a DSM system and the complexity of programming it. Weaker consistency models generally provide better performance but are much more sensitive to details of
correct programming, since data races must be prevented to ensure correctness.
Recent work has focused on developing adaptive consistency models that adjust their behavior dynamically based on application requirements and system conditions. For example, Yu and Cox (2009) proposed an adaptive release consistency protocol that dynamically switches between eager and lazy modes based on runtime information and demonstrates superior performance for several applications.
This bar chart compares different DSM consistency models based on their ordering constraints, communication overhead, and programming complexity. The chart uses a scale of 1-4 to represent relative scores for each attribute.
2.3. Multiprocessor System Topologies
Multi-processor system topologies describe the physical or logical arrangement of processors and their interconnections in a distributed system. The topology affects the performance, scalability, and fault tolerance of DSM systems (Hennessy & Patterson, 2011).
Some common multi-processor topologies include:
1. Bus-based Systems: Every processor has access to a common bus. This type of system is very easy to implement, but its scalability is severely limited by contention for the bus.
2. Mesh networks: Processors are arranged in a grid, with each processor connected to its immediate neighbors. Mesh networks provide good scalability and are commonly used in many-core processors.
3. Hypercube: Processors interconnected in a hypercube topology have short paths between any two nodes. This topology offers excellent scalability, but implementation may be quite complex for large systems.
4. Fat Tree: A tree structure in which bandwidth increases towards the root, providing high bisection bandwidth. Fat trees are widely adopted for high-performance clusters.
5. Torus: An extension of the mesh network in which edges wrap around to form toroidal structures, giving shorter communication paths than a simple mesh.
The impact of network topology on DSM performance has been studied extensively. For instance, Laudon and Lenoski (1997) demonstrated that the DASH multiprocessor could use a mesh-based topology and achieve near-linear speedup for a variety of parallel applications.
Recent works have implemented networks-on-chip (NoCs) for multi-core processors, and these can be considered for the implementation of DSM systems. An application-directed NoC architecture, proposed by Kumar et al. (2002), adapts to the communication pattern of the application. Such an architecture achieves better performance than a traditional homogeneous design. To illustrate the implementation of a simple DSM system, the article presents Python code demonstrating a basic page-based DSM (listing not reproduced here).
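As an illustration of a basic page-based DSM, the following is a minimal sketch and not the article's own listing. It models the DSM inside a single process: threads stand in for nodes, a per-page lock stands in for a coherence protocol, and a page "migrates" to whichever node touches it. The class name `PageBasedDSM`, the helper `run_workers`, the page size, and the migration policy are all assumptions made for this sketch; no real networking is involved.

```python
import threading

PAGE_SIZE = 256  # bytes per page (assumed; kept small for readability)


class PageBasedDSM:
    """Toy single-process model of a page-based DSM; threads stand in for nodes."""

    def __init__(self, num_pages: int, num_nodes: int) -> None:
        self.num_nodes = num_nodes
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        # Each page starts on a home node chosen round-robin.
        self.owner = [page % num_nodes for page in range(num_pages)]
        # One lock per page serialises access to that page.
        self.locks = [threading.Lock() for _ in range(num_pages)]
        self.transfers = 0  # number of page migrations observed
        self._stats_lock = threading.Lock()

    def _migrate(self, node: int, page: int) -> None:
        # Caller must hold self.locks[page]. Ownership moves to the accessing node.
        if self.owner[page] != node:
            self.owner[page] = node
            with self._stats_lock:
                self.transfers += 1

    def read(self, node: int, addr: int) -> int:
        page, offset = divmod(addr, PAGE_SIZE)
        with self.locks[page]:
            self._migrate(node, page)
            return self.pages[page][offset]

    def write(self, node: int, addr: int, value: int) -> None:
        page, offset = divmod(addr, PAGE_SIZE)
        with self.locks[page]:
            self._migrate(node, page)
            self.pages[page][offset] = value & 0xFF

    def add(self, node: int, addr: int, delta: int) -> None:
        """Read-modify-write performed atomically under the page lock."""
        page, offset = divmod(addr, PAGE_SIZE)
        with self.locks[page]:
            self._migrate(node, page)
            self.pages[page][offset] = (self.pages[page][offset] + delta) & 0xFF


def run_workers(dsm: PageBasedDSM, addr: int, increments: int) -> None:
    """Start one thread per node; each increments the same shared byte."""

    def worker(node: int) -> None:
        for _ in range(increments):
            dsm.add(node, addr, 1)

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(dsm.num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    dsm = PageBasedDSM(num_pages=4, num_nodes=4)
    run_workers(dsm, addr=0, increments=50)
    print("shared byte:", dsm.read(0, 0), "page transfers:", dsm.transfers)
```

Because every update goes through `add` while the page lock is held, the four workers cannot lose increments. Replacing `add` with a separate `read` followed by a `write` would reintroduce the kind of race condition that the testing techniques in this paper aim to detect.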
This example is deliberately simple; it illustrates the basic concepts of shared memory access and synchronization in a DSM system. In a real distributed environment, things are much more complex and entail additional mechanisms for inter-node communication, consistency maintenance, and fault tolerance.
The theoretical framework of DSM systems is still developing, with open issues that include improving scalability, reducing communication overhead, and adapting to new hardware architectures. As the requirements of multi-core and distributed systems grow, so does the pressure on DSM implementations and testing methodologies, which drives continuous innovation in this field.
III. AUTOMATED TESTING STRATEGIES FOR DSM
3.1 Test Generation Techniques
3.1.1 Model-Based Test Generation
Model-based test generation creates test cases for DSM systems by building a formal model of the system that abstracts its memory access, consistency, and inter-process communication. Finite state machines (FSMs) are commonly used to abstract the system's behavior. Leung and White (1989) proposed a method of generating test cases from an FSM that can be adapted for testing distributed systems; the model describes DSM states and memory transitions. Another model-based approach uses Petri nets, which are well suited to concurrent systems. Carreira and Costa (1997) applied colored Petri nets to produce test cases, analyzing interleaving scenarios to find race conditions and synchronization problems. UML state machines and activity diagrams are a more recent development. Garousi et al. (2008) proposed generating stress tests from UML models. This approach focuses on concurrent access to shared resources, helping to find bottlenecks and consistency errors.
3.1.2 Combinatorial Testing Approaches
Combinatorial testing covers a variety of configurations and input combinations, such as memory access patterns and network topologies in DSM systems. Pairwise testing, in which all pairs of input parameter values are tested, is one of the most efficient methods. Kuhn et al. (2004) showed that pairwise testing detects faults well without the number of test cases becoming unwieldy. Higher-strength combinations, such as 3-way or 4-way, provide better fault detection but increase test case counts. Nie and Leung (2011) explored adaptive random combinatorial testing, which balances higher fault detection against fewer test cases. A different approach was taken by Garvin et al. (2011), who suggested system-specific constraints on the
combinatorial testing. In this way, the generated tests will be comprehensive and valid for DSM systems.
3.2 Fault Injection Mechanisms
The effect of faults on DSM systems can be tested using fault injection, which is the deliberate injection of errors to test fault tolerance. Hardware-based fault injection tools, like Arlat et al.'s RIFLE tool (1990), simulate hardware faults in multiprocessor systems but are costly and generally less flexible.
Software-based fault injection is more flexible and more commonly used; it can simulate a wide range of faults, including memory corruption and network failures. Network-level fault injection is especially relevant to DSM. Kanawati et al. (1995) proposed the FERRARI tool, which injects faults into the operating system and application layers. Dawson et al. (1996) developed Orchestra, which simulates message delays, losses, and corruption to assess the impact of network-related failures on DSM systems.
Recent advances include sophisticated fault injection techniques that use machine learning algorithms to direct the injection of faults at specific critical vulnerabilities. Banzai et al. (2010) detail a system in which critical fault scenarios in DSM can be identified automatically using machine learning.
3.3 Concurrency and Race Condition Identification
Concurrency problems and race conditions in DSM are extremely challenging to identify because they are often intermittent and very hard to reproduce; test cases alone are therefore not enough.
Static analysis techniques discover potential race conditions without running the actual code. In 2003, Engler and Ashcraft developed RacerX, a tool that can find race conditions and deadlocks in large-scale systems. However, static analysis may produce false positives and can miss some dynamic runtime issues.
Dynamic analysis tools monitor the execution of a program for concurrency faults. For the detection of data races, Savage et al. proposed Eraser, a tool based on the lockset algorithm, in 1997. Its variants have been applied to distributed systems, including DSM.
Hybrid approaches combine static and dynamic analysis to achieve high accuracy with efficiency. Choi et al. (2002) showed that static analysis could guide dynamic race detection, significantly reducing runtime overhead while retaining good detection rates.
Recent work in predictive analysis uses trace analysis to predict concurrency-related problems. Huang et al. (2014) suggested MaxSMT, a framework that discovers latent concurrency bugs in large-scale systems, including DSM.
Since DSM systems are widely used in high-performance and data-intensive computing, developing more efficient automated testing methods remains highly important for producing more sophisticated tests, better test coverage, and fewer false positives.
This grouped bar chart compares the effectiveness and implementation complexity of different automated testing strategies for DSM systems. The scores are based on a scale of 0-100, derived from the research findings.
IV. AUTOMATED TEST EXECUTION AND MONITORING
4.1 Parallel Test Execution Frameworks
Parallel test execution frameworks are essential for running the DSM system under test because they allow multiple test cases across the distributed nodes to be
executed concurrently. This ensures that realistic concurrent scenarios are created while minimizing the overall testing time. GTAC has been very effective in bringing to light various parallel testing frameworks that apply to DSM systems.
A good example of such a framework is Selenium Grid, which was primarily designed to test web applications but is also used for distributed systems testing. In this framework, tests are executed in parallel on machines with different operating systems; hence it can be useful for DSM in heterogeneous environments. Another example is the TestNG framework developed by Cédric Beust, which has built-in support for parallel test execution and has already been applied effectively in DSM testing scenarios.
Among the latest innovations in parallel test execution is the development of cloud-based testing platforms. For instance, Orso and Rothermel (2014) reported on the emergence of Testing-as-a-Service (TaaS) platforms, which leverage cloud infrastructure to provide scalable, on-demand testing resources. Such a platform would be highly appropriate for DSM testing, as it could readily host large-scale distributed scenarios.
4.2 Real-time Monitoring and Logging
Real-time monitoring and logging are needed to understand the behavior of DSM systems under test; they provide insight into actual performance and resource usage and give a clear view of problems as they arise. Barham et al. [7] proposed Magpie, which captures distributed system behaviors by monitoring events across operating systems, middleware, and applications.
Log analysis is a very important aspect of DSM testing. The Elastic Stack (Elasticsearch, Logstash, and Kibana) is currently one of the most popular solutions for collecting, processing, and visualizing log data from distributed systems; it helps to find patterns or anomalies in a test run.
Distributed tracing systems are also very important for monitoring DSM systems. Sigelman et al. (2010) presented Dapper, a large-scale distributed systems tracing infrastructure, which in turn inspired tools like Jaeger and Zipkin. These tools give end-to-end visibility into request flows, enabling the identification of performance bottlenecks and the analysis of system behaviour under different test settings.
4.3 Performance Metrics and Benchmarking
Performance metrics and benchmarks measure the efficiency and scalability of DSM systems. The primary metrics are throughput, latency, memory consistency, and scalability. SPEC has developed SPECjbb, among other benchmarks, to quantify Java server performance in multi-threaded environments.
Open-source benchmarks are often reused for DSM with adaptations. For instance, a variant of the widely known PARSEC benchmark suite (Bienia et al., 2008), which assesses DSM implementations by executing multi-threaded programs, is an example of an adapted DSM benchmark. NASA's NAS Parallel Benchmarks (NPB) test parallel and distributed systems, including DSM, using applications related to computational fluid dynamics (CFD).
Recently, benchmarks have begun to focus on emerging DSM architectures. Ferdman et al. (2012) developed CloudSuite, a benchmark suite of scale-out workloads for cloud environments that includes data analytics, serving, and media streaming workloads, and is thus well suited to large-scale DSM evaluation.
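The metrics named in Section 4.3 (throughput, latency, and scalability) can be computed from raw measurements along the following lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, latency is summarised with a nearest-rank percentile, and scalability is expressed through the standard speedup and parallel efficiency ratios; the sample numbers are made up for illustration and are not results from the article.

```python
import math
from typing import Sequence


def throughput(num_ops: int, elapsed_s: float) -> float:
    """Completed operations per second."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return num_ops / elapsed_s


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample (0 < pct <= 100)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


def speedup(t_single: float, t_parallel: float) -> float:
    """Ratio of single-processor run time to run time on p processors."""
    return t_single / t_parallel


def parallel_efficiency(t_single: float, t_parallel: float, processors: int) -> float:
    """Speedup divided by the number of processors (1.0 means linear scaling)."""
    return speedup(t_single, t_parallel) / processors


if __name__ == "__main__":
    latencies_ms = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # illustrative values only
    print("throughput (ops/s):", throughput(1000, 2.0))
    print("p50 latency (ms):", percentile(latencies_ms, 50))
    print("p90 latency (ms):", percentile(latencies_ms, 90))
    print("efficiency on 8 processors:", parallel_efficiency(100.0, 25.0, 8))
```

Reporting a high percentile alongside the median matters for DSM because remote page fetches and coherence traffic tend to show up in the tail of the latency distribution rather than in the average.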
This line graph shows the trends of key performance metrics (throughput, latency, and consistency) as the number of processors increases in a DSM system. The x-axis uses a logarithmic scale to better represent the exponential growth in the number of processors.
V. VERIFICATION AND VALIDATION TECHNIQUES
5.1. Formal Verification Methods
Formal verification provides mathematical proofs of correctness for DSM systems, giving strong confidence in system behaviour. Model checking is a popular technique that explores the state space of a system to confirm that certain properties are satisfied. Clarke et al. (1999) provide a comprehensive survey of model checking for concurrent and distributed systems.
Theorem proving is another technique adopted for DSM verification. The Isabelle/HOL theorem prover, described by Nipkow et al. in 2002, has been used to verify properties of DSM algorithms. Coq has likewise been used to verify distributed consensus algorithms, which are among the algorithms that can ensure DSM consistency.
Recent work concentrates on compositional verification techniques that fight state explosion by verifying components in isolation and then combining the results. Flanagan et al. (2005) presented thread-modular verification, which has been used successfully for concurrent and distributed systems, including DSM.
5.2. Runtime Assertion Checking
Runtime assertion checking adds logical assertions to the code so that behavioural violations become evident during execution. Its advantage is that it can identify consistency and synchronization problems that static analysis would otherwise miss.
The Java Modeling Language (JML) by Leavens et al. (1999) supports runtime assertion checking for Java programs and has been extended with support for concurrent and distributed systems, making it suitable for DSM testing.
Recent developments in this area include efficient assertion checking for large-scale distributed systems. Meredith et al. (2012) proposed JavaMOP, a runtime verification framework that can check for violations in DSM systems by monitoring distributed Java applications at runtime, using aspect-oriented programming to instrument code with checks.
5.3. Automated Oracles for DSM Testing
Test oracles determine whether a test case has passed or failed. Creating oracles for DSM systems is difficult because of the complex interactions and non-deterministic behavior. An overview of oracle strategies for distributed systems testing is given by Baresi and Young (2001).
Metamorphic testing: Chen et al. introduced metamorphic testing in 1998 as a promising technique that relies on known relationships between multiple executions to overcome the oracle problem. So far, it has been used successfully in many parallel and distributed systems, including DSM.
Recent advances include machine learning, which is now applied to generate oracles automatically. Vanmali et al. (2002) showed how neural networks can be leveraged to learn the behaviour of distributed systems and create oracles to detect anomalies. This approach shows good potential for finding inconsistencies and performance-related DSM issues.
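The metamorphic testing idea from Section 5.3 can be sketched with one simple relation: if all operations are commutative increments, the final memory state must not depend on the order in which they are applied. The code below is an illustration under assumptions of our own, not the method of Chen et al.: the memory is a plain dictionary, `apply_increments` plays the role of a correct system under test, and `apply_overwrites` is a deliberately faulty variant that loses updates.

```python
from itertools import permutations
from typing import Callable, Dict, List, Tuple

Op = Tuple[int, int]  # (address, delta)
Memory = Dict[int, int]


def apply_increments(ops: List[Op]) -> Memory:
    """Stand-in for a correct system: each op adds delta to the value at address."""
    memory: Memory = {}
    for addr, delta in ops:
        memory[addr] = memory.get(addr, 0) + delta
    return memory


def apply_overwrites(ops: List[Op]) -> Memory:
    """Deliberately faulty variant: an increment overwrites the old value (lost update)."""
    memory: Memory = {}
    for addr, delta in ops:
        memory[addr] = delta
    return memory


def order_invariant(run: Callable[[List[Op]], Memory], ops: List[Op]) -> bool:
    """Metamorphic relation: every reordering of ops must give the same final state.

    All permutations are tried, so this is only practical for short op lists.
    """
    baseline = run(ops)
    return all(run(list(order)) == baseline for order in permutations(ops))


if __name__ == "__main__":
    ops = [(0, 1), (0, 2), (1, 5)]
    print("correct system passes:", order_invariant(apply_increments, ops))
    print("faulty system passes:", order_invariant(apply_overwrites, ops))
```

No expected final value is ever specified here; the check compares executions with each other, which is what lets a metamorphic relation stand in for an oracle when the correct output of a non-deterministic run is hard to state in advance.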
VI. TEST RESULT ANALYSIS
6.1. Statistical Test Results Analysis
Statistical analysis is particularly useful for analysing test output from DSM systems, especially when automated test execution produces large volumes of data. Hypothesis testing and confidence interval estimation are common methods for drawing meaningful insights from test results.
Regression analysis is very effective for gauging the relationship between multiple system parameters and performance metrics in DSM. For example, Zhou et al. (2004) used multiple regression analysis to model DSM performance under various workload conditions, identifying the factors that affect the system's scalability.
Recent developments in statistical analysis techniques for DSM testing include Bayesian inference methods. These can incorporate prior knowledge about system behaviour into the analysis of test results, improving the accuracy and precision of performance predictions and anomaly detection.
6.2. Machine Learning for Anomaly Detection
Machine learning techniques have come to lead the analysis of test results and anomalies in DSM systems. Supervised learning algorithms, such as SVMs and random forests, are widely applied to classify system behaviours and identify possible faults from historical test data.
Unsupervised learning approaches, especially clustering algorithms, are well suited to DSM anomaly detection, where anomalies appear as deviations from normal patterns. For instance, Xu et al. (2009) employed a modified K-means clustering algorithm to identify performance anomalies in large-scale distributed systems, including DSM-based systems.
Deep learning methods have also proved promising for anomaly detection in DSM. RNN and LSTM networks have been widely applied to time series data from distributed systems with good results, suggesting that subtle temporal patterns can indicate system problems.
6.3. Test Data Visualization Techniques
Visualization techniques help testers and developers understand complex interactions and performance characteristics in DSM systems. Graphical presentation of test results reveals patterns and anomalies that might not be evident from raw numbers alone.
Heat maps and color-coded matrices are widely used to show access patterns and contention in DSM systems, enabling hotspots and potential performance bottlenecks to be identified. Node-link diagrams and force-directed graphs are commonly used to represent the topology and communication patterns of distributed systems, which helps in analysing network-related problems.
New research on visualization for DSM testing addresses interactive and real-time visualization facilities. These let a tester explore massive datasets dynamically, zooming into parts of a timeline or a system component as needed. For example, Adamoli and Hauswirth (2010) proposed Trevis, a context tree visualization and analysis tool for exploring the behaviour of large-scale parallel applications, which has been applied to DSM systems.
VII. CHALLENGES AND LIMITATIONS
7.1. Scalability Issues in Large-Scale Systems
Testing DSM systems at scale is a hard problem because interactions are highly complex and the data volume grows exponentially with system size. Traditional testing approaches fail to identify emergent behaviors that appear only when the system is scaled up. Cantin et al. (2005) discuss the challenges of scaling cache coherence protocols for DSM systems and the need for innovative testing approaches that could alleviate these problems.
One way to overcome scalability problems is to use emulation and simulation. One such tool is BigSim, developed by Zheng et al. (2004), which can simulate very large parallel systems on a small cluster, allowing the tester to analyse the system's behaviour at other scales without requiring large-scale hardware.
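One way to make the scaling problem of Section 7.1 concrete is to count how many distinct interleavings a tester would have to consider. Assuming n processors that each execute k atomic operations in program order, the number of possible global orderings is the multinomial coefficient (n*k)! / (k!)^n. This is a standard combinatorial count used here for illustration; it is not a figure reported in the article, and the function name is hypothetical.

```python
from math import factorial


def interleavings(processors: int, ops_per_processor: int) -> int:
    """Number of global orderings of n processors x k atomic ops each.

    Each processor's own operations keep their program order, so the count is
    the multinomial coefficient (n*k)! / (k!)**n.
    """
    if processors < 1 or ops_per_processor < 0:
        raise ValueError("need at least one processor and a non-negative op count")
    total_ops = processors * ops_per_processor
    return factorial(total_ops) // factorial(ops_per_processor) ** processors


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 8):
        print(f"{n} processors x 4 ops: {interleavings(n, 4)} interleavings")
```

Even with only a handful of operations per processor, the count grows far faster than the processor count, which is why exhaustive schedule exploration is replaced in practice by simulation, sampling, or partial-order techniques.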
7.2. Non-determinism in Multi-Processor Environments
Non-determinism pervades the testing of multi-processor systems, including DSM-based systems. The interleaving of operations across processors may give rise to race conditions and timing-dependent bugs that are challenging to reproduce and debug. Lu et al. (2008) present a comprehensive study of the characteristics of concurrency bugs and their implications for distributed system testing.
Methods for handling non-determinism include deterministic replay systems, which attempt to replay exact execution sequences for debugging. For example, the idea of deterministic shared memory multiprocessing (DMP) was developed by Hower and Hill in 2008; it provides a deterministic execution context for parallel programs while still delivering high performance.
7.3. Test Coverage and Completeness
Exhaustive test coverage is difficult to achieve in DSM systems because of the massive state space and the added complexity of interactions between distributed components. Traditional code coverage metrics may not capture most aspects of distributed behavior and thus do not suffice for assessing DSM system tests.
Recent research has focused on the design of coverage metrics targeted at distributed systems. As an example, Stoller (2002) introduces partial-order coverage for testing concurrent systems, which tries to capture the coverage of different event orderings rather than simple code paths.
This logarithmic plot shows the relationship between test execution time and two types of coverage: code coverage and state space coverage. It illustrates the challenges in achieving comprehensive testing for DSM systems.
VIII. FUTURE RESEARCH DIRECTIONS
8.1. Integration with Emerging Hardware Architectures
As hardware architectures advance, further DSM testing research will have to address the challenges posed by emerging technologies such as non-volatile memory, 3D-stacked memory, and heterogeneous computing systems. New testing strategies may be needed for DSM implementations on such architectures, whilst ensuring the correctness and performance of the DSM implementations.
8.2. Cloud-Based DSM Testing Platforms
The increasing adoption of cloud computing offers both a challenge and an opportunity to develop scalable, on-demand testing frameworks for DSM systems. This may eventually lead to cloud-native testing frameworks that dynamically allocate resources and simulate large-scale distributed environments with high fidelity.
8.3. AI-Driven Test Optimization Strategies
The integration of artificial intelligence and machine learning techniques will allow various aspects of DSM testing to be optimized. Future research could use reinforcement learning algorithms that automatically generate and refine test cases, or apply
International Journal of Scientific Research in Science, Engineering and Technology | www.ijsrset.com
517
Swethasri Kavuri Int J Sci Res Sci Eng Technol, May-June-2019, 6 (3) : 508-521
natural language processing techniques in the analysis
5.
of system logs for potential issues.
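As an illustration of the reinforcement-learning direction, the following minimal sketch uses an epsilon-greedy multi-armed bandit, one of the simplest reinforcement-learning formulations, to learn which test-generation strategy most often exposes a lost update. Everything in it is an assumption made for illustration and is not taken from the cited work: the toy model of two nodes incrementing one shared word without synchronization, the three strategy names, the reward of 1 for a schedule that loses an update, and the parameter values.

```python
import random


def run_schedule(schedule):
    """Toy model (hypothetical): two nodes increment one shared word with no lock.

    `schedule` lists node ids; a node's first appearance reads the shared
    word, its second appearance writes back the value it read plus one.
    Returns the final shared value (2 means no update was lost).
    """
    shared = 0
    local = {}
    for node in schedule:
        if node not in local:
            local[node] = shared       # read step
        else:
            shared = local[node] + 1   # write step
    return shared


# Hypothetical test-generation strategies; each returns one schedule.
STRATEGIES = {
    "sequential": lambda rng: [0, 0, 1, 1],
    "preempt_after_read": lambda rng: [0, 1, 0, 1],
    "random_shuffle": lambda rng: rng.sample([0, 0, 1, 1], 4),
}


def select_strategy(rounds=200, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over STRATEGIES; reward 1.0 when an update is lost."""
    rng = random.Random(seed)
    names = list(STRATEGIES)
    value = {name: 0.0 for name in names}   # running mean reward per strategy
    pulls = {name: 0 for name in names}
    for t in range(rounds):
        if t < len(names):
            name = names[t]                       # try every strategy once
        elif rng.random() < epsilon:
            name = rng.choice(names)              # explore
        else:
            name = max(names, key=value.get)      # exploit current best
        schedule = STRATEGIES[name](rng)
        reward = 1.0 if run_schedule(schedule) != 2 else 0.0
        pulls[name] += 1
        value[name] += (reward - value[name]) / pulls[name]
    return max(names, key=value.get), value, pulls
```

Calling select_strategy() returns the preferred strategy together with the estimated value and pull count of each strategy. In an actual DSM test harness the arms would be real test generators and the reward would come from a bug detector, so this sketch only shows the shape of the feedback loop.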
IX. CONCLUSION
9.1. Summary of Key Findings
This work covered several aspects of automation in Distributed Shared Memory testing for multi-processor systems. Major findings include the relevance of model-based and combinatorial testing approaches, the efficiency of fault-injection-based techniques, and the value of runtime monitoring with assertions on the correctness and performance of the system.
9.2. Indicative Implications for Industry and Research
The conclusions drawn from this work have implications for both industry practice and academic research. For industry, the adoption of an automated testing strategy may allow more robust and reliable DSM implementations, and some of the costs of development could be recovered through performance improvements. For researchers, the present work draws attention to various topics that are worth further exploration, particularly those addressing the scalability and non-determinism challenges within DSM testing.
9.3. Recommendations for Implementation
Based on the results of this research, the following strategies are recommended for the efficient automated testing of DSM systems:
1. Adopt a combination of static analysis and dynamic analysis techniques for DSM implementations to detect potential problems.
2. Exploit parallel test execution frameworks and cloud-based testing platforms to enhance testing efficiency and scale properly.
3. Develop robust monitoring and logging mechanisms that give deep insight into system behaviour during test execution.
4. Explore machine learning and AI-based anomaly detection methods and test optimization techniques.
5. Invest in developing domain-specific benchmarks and performance metrics that best portray DSM system behaviour.
By following these recommendations and keeping track of the latest research in this field, organizations should improve their capability to develop and maintain reliable, high-performance DSM systems across multi-processor environments.
X. REFERENCES
[1]. Adamoli, A., & Hauswirth, M. (2010). Trevis: A context tree visualization & analysis tool for performance traces. In Proceedings of the 5th international symposium on Software visualization (pp. 153–162).
[2]. Adve, S. V., & Gharachorloo, K. (1996). Shared memory consistency models: A tutorial. Computer, 29(12), 66–76.
[3]. Arlat, J., Aguera, M., Amat, L., Crouzet, Y., Fabre, J. C., Laprie, J. C., & Powell, D. (1990). Fault injection for dependability validation: A methodology and some applications. IEEE Transactions on Software Engineering, 16(2), 166–182.
[4]. Banzai, T., Koizumi, H., Kanbayashi, R., Imada, T., Hanawa, T., & Sato, M. (2010). D-cloud: Design of a software testing environment for reliable distributed systems using cloud computing technology. In 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing (pp. 631–636).
[5]. Baresi, L., & Young, M. (2001). Test oracles. Technical Report CIS-TR-01-02, University of Oregon, Dept. of Computer and Information Science, Eugene, Oregon, USA.
[6]. Barham, P., Donnelly, A., Isaacs, R., & Mortier, R. (2004). Using magpie for request extraction and workload modelling. In OSDI (Vol. 4, pp. 18–18).
[7]. Bershad, B. N., & Zekauskas, M. J. (1991). Midway: Shared memory parallel programming with entry consistency for distributed memory multiprocessors. Carnegie-Mellon University, Department of Computer Science.
[8]. Bienia, C., Kumar, S., Singh, J. P., & Li, K. (2008). The PARSEC benchmark suite: Characterization and architectural implications. In Proceedings of the 17th international conference on Parallel architectures and compilation techniques (pp. 72–81).
[9]. Cantin, J. F., Lipasti, M. H., & Smith, J. E. (2005). The complexity of verifying memory coherence and consistency. IEEE Transactions on Parallel and Distributed Systems, 16(7), 663–671.
[10]. Carreira, J., & Costa, D. (1997). Automatically verifying an object-oriented specification of the Steam-Boiler control system. In International Symposium of Formal Methods Europe (pp. 262–279). Springer, Berlin, Heidelberg.
[11]. Chen, T. Y., Cheung, S. C., & Yiu, S. M. (1998). Metamorphic testing: A new approach for generating next test cases. Technical Report HKUST-CS98-01, Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong.
[12]. Choi, J. D., Lee, K., Loginov, A., O'Callahan, R., Sarkar, V., & Sridharan, M. (2002). Efficient and precise data-race detection for multithreaded object-oriented programs. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation (pp. 258–269).
[13]. Clarke, E. M., Grumberg, O., & Peled, D. (1999). Model checking. MIT press.
[14]. Dawson, S., Jahanian, F., & Mitton, T. (1996). ORCHESTRA: A fault injection environment for distributed systems. In Proceedings of 26th International Symposium on Fault-Tolerant Computing (FTCS-26) (pp. 404–414).
[15]. Engler, D., & Ashcraft, K. (2003). RacerX: Effective, static detection of race conditions and deadlocks. ACM SIGOPS Operating Systems Review, 37(5), 237–252.
[16]. Ferdman, M., Adileh, A., Kocberber, O., Volos, S., Alisafaee, M., Jevdjic, D., & Falsafi, B. (2012). Clearing the clouds: A study of emerging scale-out workloads on modern hardware. ACM SIGPLAN Notices, 47(4), 37–48.
[17]. Flanagan, C., Freund, S. N., Qadeer, S., & Seshia, S. A. (2005). Modular verification of multithreaded programs. Theoretical Computer Science, 338(1–3), 153–183.
[18]. Garvin, B. J., Cohen, M. B., & Dwyer, M. B. (2011). Evaluating improvements to a meta-heuristic search for constrained interaction testing. Empirical Software Engineering, 16(1), 61–102.
[19]. Garousi, V., Briand, L. C., & Labiche, Y. (2008). Traffic-aware stress testing of distributed real-time systems based on UML models using genetic algorithms. Journal of Systems and Software, 81(2), 161–185.
[20]. Gharachorloo, K., Lenoski, D., Laudon, J., Gibbons, P., Gupta, A., & Hennessy, J. (1990). Memory consistency and event ordering in scalable shared-memory multiprocessors. ACM SIGARCH Computer Architecture News, 18(2SI), 15–26.
[21]. Hennessy, J. L., & Patterson, D. A. (2011). Computer architecture: A quantitative approach. Elsevier.
[22]. Hower, D. R., & Hill, M. D. (2008). Rerun: Exploiting episodes for lightweight memory race recording. ACM SIGARCH Computer Architecture News, 36(3), 265–276.
[23]. Huang, J., Meredith, P. O., & Rosu, G. (2014). Maximal sound predictive race detection with control flow abstraction. ACM SIGPLAN Notices, 49(6), 337–348.
[24]. Kanawati, G. A., Kanawati, N. A., & Abraham, J. A. (1995). FERRARI: A flexible software-based fault and error injection system. IEEE Transactions on Computers, 44(2), 248–260.
[25]. Keleher, P., Cox, A. L., Dwarkadas, S., & Zwaenepoel, W. (1992). Lazy release consistency for software distributed shared memory. In Proceedings of the 19th annual international symposium on Computer architecture (pp. 13–21).
[26]. Kuhn, D. R., Wallace, D. R., & Gallo, A. M. (2004). Software fault interactions and implications for software testing. IEEE Transactions on Software Engineering, 30(6), 418–421.
[27]. Lamport, L. (1979). How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Transactions on Computers, 100(9), 690–691.
[28]. Laudon, J., & Lenoski, D. (1997). The SGI Origin: A ccNUMA highly scalable server. ACM SIGARCH Computer Architecture News, 25(2), 241–251.
[29]. Leavens, G. T., Baker, A. L., & Ruby, C. (1999). Preliminary design of JML: A behavioral interface specification language for Java. ACM SIGSOFT Software Engineering Notes, 31(3), 1–38.
[30]. Leung, H. K., & White, L. (1989). Insights into regression testing (software testing). In Proceedings. Conference on Software Maintenance-1989 (pp. 60–69). IEEE.
[31]. Li, K., & Hudak, P. (1989). Memory coherence in shared virtual memory systems. ACM Transactions on Computer Systems (TOCS), 7(4), 321–359.
[32]. Bhavesh Kataria, Characterization and Identification of Rice Grains Through Digital Image Analysis in International Journal – Sanshodhan, ISSN 0975-4245, December, 2011 (Print)
[33]. Lu, S., Park, S., Seo, E., & Zhou, Y. (2008). Learning from mistakes: A comprehensive study on real world concurrency bug characteristics. ACM SIGARCH Computer Architecture News, 36(1), 329–339.
[34]. Meredith, P. O., Jin, D., Griffith, D., Chen, F., & Roşu, G. (2012). An overview of the MOP runtime verification framework. International Journal on Software Tools for Technology Transfer, 14(3), 249–289.
[35]. Nelson, J., Holt, B., Myers, B., Briggs, P., Ceze, L., Kahan, S., & Oskin, M. (2015). Latency-tolerant software distributed shared memory. In 2015 USENIX Annual Technical Conference (USENIX ATC 15) (pp. 291–305).
[36]. Nie, C., & Leung, H. (2011). A survey of combinatorial testing. ACM Computing Surveys (CSUR), 43(2), 1–29.
[37]. Nipkow, T., Paulson, L. C., & Wenzel, M. (2002). Isabelle/HOL: A proof assistant for higher-order logic (Vol. 2283). Springer Science & Business Media.
[38]. Orso, A., & Rothermel, G. (2014). Software testing: A research travelogue (2000–2014).
[39]. Santhosh Palavesh. (2019). The Role of Open Innovation and Crowdsourcing in Generating New Business Ideas and Concepts. International Journal for Research Publication and Seminar, 10(4), 137–147. https://doi.org/10.36676/jrps.v10.i4.1456
[40]. Challa, S. S. S., Tilala, M., Chawda, A. D., & Benke, A. P. (2019). Investigating the use of natural language processing (NLP) techniques in automating the extraction of regulatory requirements from unstructured data sources. Annals of Pharma Research, 7(5), 380-387.
[41]. Challa, S. S., Tilala, M., Chawda, A. D., & Benke, A. P. (2019). Investigating the use of natural language processing (NLP) techniques in automating the extraction of regulatory requirements from unstructured data sources. Annals of PharmaResearch, 7(5), 380-387.
[42]. Bhavesh Kataria, "Role of Information Technology in Agriculture : A Review", International Journal of Scientific Research in Science, Engineering and Technology, Print ISSN : 2395-1990, Online ISSN : 2394-4099,
Volume 1, Issue 1, pp.01-03, 2014. Available at : https://doi.org/10.32628/ijsrset141115
[43]. Dr. Saloni Sharma, & Ritesh Chaturvedi. (2017). Blockchain Technology in Healthcare Billing: Enhancing Transparency and Security. International Journal for Research Publication and Seminar, 10(2), 106–117. Retrieved from https://jrps.shodhsagar.com/index.php/j/article/view/1475
[44]. Bhaskar, V. V. S. R., Etikani, P., Shiva, K., Choppadandi, A., & Dave, A. (2019). Building explainable AI systems with federated learning on the cloud. Journal of Cloud Computing and Artificial Intelligence, 16(1), 1–14.
[45]. Bhavesh Kataria, Analysis of Rice Grains Through Digital Image Processing, SCI-TECH Research (National Journal) ISSN 0974 – 9780, February, 2012 (Print)
[46]. Big Data Analytics using Machine Learning Techniques on Cloud Platforms. (2019). International Journal of Business Management and Visuals, ISSN: 3006-2705, 2(2), 54-58. https://ijbmv.com/index.php/home/article/view/76
[47]. Secure Federated Learning Framework for Distributed Ai Model Training in Cloud Environments. (2019). International Journal of Open Publication and Exploration, ISSN: 3006-2853, 7(1), 31-39. https://ijope.com/index.php/home/article/view/145
[48]. Challa, S. S. S., Tilala, M., Chawda, A. D., & Benke, A. P. (2019). Investigating the use of natural language processing (NLP) techniques in automating the extraction of regulatory requirements from unstructured data sources. Annals of Pharma Research, 7(5),
[49]. Ghavate, N. (2018). An Computer Adaptive Testing Using Rule Based. Asian Journal For Convergence In Technology (AJCT) ISSN -2350-1146, 4(I). Retrieved from http://asianssr.org/index.php/ajct/article/view/443
[50]. Tripathi, A. (2019). Serverless architecture patterns: Deep dive into event-driven, microservices, and serverless APIs. International Journal of Creative Research Thoughts (IJCRT), 7(3), 234-239. Retrieved from http://www.ijcrt.org
[51]. Kanchetti, D., Munirathnam, R., & Thakkar, D. (2019). Innovations in workers compensation: XML shredding for external data integration. Journal of Contemporary Scientific Research, 3(8). ISSN (Online) 2209-0142.
[52]. Aravind Reddy Nayani, Alok Gupta, Prassanna Selvaraj, Ravi Kumar Singh, & Harsh Vaidya. (2019). Search and Recommendation Procedure with the Help of Artificial Intelligence. International Journal for Research Publication and Seminar, 10(4), 148–166. https://doi.org/10.36676/jrps.v10.i4.1503
[53]. Rinkesh Gajera, "Leveraging Procore for Improved Collaboration and Communication in Multi-Stakeholder Construction Projects", International Journal of Scientific Research in Civil Engineering (IJSRCE), ISSN : 2456-6667, Volume 3, Issue 3, pp.47-51, May-June.2019
[54]. Gudimetla, S. R., et al. (2015). Mastering Azure AD: Advanced techniques for enterprise identity management. Neuroquantology, 13(1), 158-163. https://doi.org/10.48047/nq.2015.13.1.792
[55]. Gudimetla, S. R., & et al. (2015). Beyond the barrier: Advanced strategies for firewall implementation and management. NeuroQuantology, 13(4), 558-565. https://doi.org/10.48047/nq.2015.13.4.876
[56]. Bhavesh Kataria, "Variant of RSA-Multi prime RSA", International Journal of Scientific Research in Science, Engineering and Technology, Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 1, Issue 1, pp.09-11, 2014. Available at https://doi.org/10.32628/ijsrset14113