This research paper explores the critical domain of automated testing for Distributed Shared Memory (DSM) systems in multi-processor environments. As the complexity of multi-core and distributed computing systems continues to grow, ensuring the reliability and performance of DSM implementations becomes increasingly challenging. This study investigates various automated testing strategies, including test generation techniques, fault injection mechanisms, and concurrency detection methods. It also examines automated test execution frameworks, real-time monitoring solutions, and advanced verification and validation techniques. The research highlights the challenges faced in DSM testing, such as scalability issues and non-determinism, and proposes future directions for research, including the integration of artificial intelligence and cloud-based testing platforms. The findings of this study contribute to the advancement of DSM testing methodologies and provide valuable insights for both researchers and practitioners in the field of distributed systems and parallel computing.

International Journal of Scientific Research in Science, Engineering and Technology
Print ISSN: 2395-1990 | Online ISSN: 2394-4099
Available online at: www.ijsrset.com
doi: https://doi.org/10.32628/IJSRSET12411594

Automation in Distributed Shared Memory Testing for Multi-Processor Systems
Swethasri Kavuri
Independent Researcher, USA

Article History: Accepted: 20 May 2019; Published: 30 May 2019
Publication Issue: Volume 6, Issue 3, May-June 2019
Page Number: 508-521

Keywords: Distributed Shared Memory, Multi-Processor Systems, Automated Testing, Fault Injection, Concurrency Detection, Formal Verification, Performance Benchmarking, Parallel Computing

Copyright © 2024 The Author(s): This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
Swethasri Kavuri, Int J Sci Res Sci Eng Technol, May-June-2019, 6 (3): 508-521

I. INTRODUCTION

1.1 Distributed Shared Memory (DSM) Systems Context
DSM systems are an important paradigm for parallel and distributed computing: they provide one uniform memory abstraction over physically distributed memory modules. DSM systems attempt to combine the programming simplicity of shared memory models with the scalability and fault-tolerance advantages of a distributed system (Tanenbaum & van Steen, 2017).
DSM first came into being in the mid-1980s. Since then it has grown considerably in complexity to meet the rising needs of high-performance computing and large-scale data processing.

1.2 Testing of Multi-Processor Systems
Testing multi-processor systems, and DSM systems in particular, poses challenges of its own. These include:
I. Concurrency issues: Race conditions, deadlocks, and livelocks are very difficult to identify and reproduce.
II. Non-determinism: Interleaving operations across multiple processors is likely to lead to non-deterministic behaviour.
III. Scalability: The complexity of testing does not grow linearly with the number of processors but exponentially as the number of processors increases.
IV. Memory consistency: Consistency models have to be implemented correctly and then maintained throughout the system.
V. Performance variation: Network latency, cache coherence protocols, and related factors cause system performance to vary, and testing has to take this variation into account.

This pie chart illustrates the distribution of challenges in DSM testing, highlighting the relative importance of each challenge based on the research findings.

1.3 Research Objectives and Scope
This research addresses the following objectives:
1. Critically analyse and evaluate test automation approaches tailored specifically to DSM applications in a multi-processor environment.
2. Review state-of-the-art test generation techniques and fault injection methodologies that enable thorough testing of DSM systems.
3. Examine automated test execution frameworks and real-time monitoring frameworks for managing test activities effectively.
4. Examine verification and validation approaches, such as formal methods and runtime assertion checking, in DSM systems.
5. Discuss the challenges and limitations of DSM testing and potential avenues for future research.
The scope of this research covers both the software and hardware aspects of DSM testing, with a focus on automated approaches that can improve the reliability, performance, and scalability of multi-processor systems built on DSM architectures.

II. THEORETICAL FRAMEWORK

2.1. Distributed Shared Memory Architecture
In a DSM architecture, every processor in a distributed system has access to a global address space. This abstraction enables processes resident on different nodes to share data as if there were a single shared memory, even though the memory resides on several machines (Protic et al., 1996).
DSM systems fall broadly into two categories: hardware-based and software-based. Hardware-based DSM systems, such as the Stanford DASH multiprocessor, rely on hardware support to achieve coherence and consistency. Software-based DSM systems, such as TreadMarks and Munin, implement the shared memory abstraction entirely in software, gaining flexibility at a potential loss of performance.

Key Components in DSM System Implementation:
[1]. Memory Management: Performs allocation and deallocation of shared memory regions.
[2]. Consistency Protocol: Regulates memory operations according to the chosen consistency model.
[3]. Communication Subsystem: Governs message passing between nodes in order to transfer data and to let a node synchronize with other nodes.
[4]. Coherence Mechanism: Keeps shared data coherent across multiple caches.

Recent developments in DSM architectures include hybrid systems that integrate the shared memory and message passing paradigms. For instance, Grappa, the runtime system presented by Nelson et al. (2015), provides a DSM abstraction on commodity clusters with improved performance for irregular applications.

2.2. Consistency Models in DSM Systems
Memory consistency models define the rules that govern how memory operations are ordered and made visible in a DSM system. These models form a contract between the programmer and the system, stating how memory operations will behave (Adve & Gharachorloo, 1996).
Some common consistency models are:
• Sequential Consistency (SC): Proposed by Lamport in 1979, SC ensures that the result of any execution is the same as if the operations of all processors were executed in some sequential order, with the operations of each individual processor appearing in that sequence in the order specified by its program.
• Release Consistency (RC): Gharachorloo et al. designed RC, which relaxes the constraints of SC by allowing memory operations to be reordered between synchronization points, thus improving performance.
• Lazy Release Consistency (LRC): LRC is an optimization of RC designed by Keleher et al. In LRC, the propagation of modifications is delayed until the next synchronization operation is encountered, which reduces communication overhead.
• Entry Consistency (EC): Bershad and Zekauskas (1991) proposed EC. It associates shared variables with synchronization objects, which permits fine-grained consistency management.

Table 2: Comparison of these consistency models in characteristics and performance impacts:
Consistency Model | Ordering Constraints | Communication Overhead | Programming Complexity
Sequential Consistency | Strict global order | High | Low
Release Consistency | Relaxed between synchronization points | Medium | Medium
Lazy Release Consistency | Further relaxed, delayed propagation | Low | Medium
Entry Consistency | Fine-grained, data-centric | Very Low | High

The choice of consistency model makes a significant difference both to the performance attained by a DSM system and to the complexity of programming it. Weaker consistency models generally provide better performance but are much more sensitive to programming details: the programmer must prevent data races to ensure correctness.
Recent work has focused on adaptive consistency models that change their behaviour dynamically based on application requirements and system conditions. For example, Yu and Cox (2009) proposed an adaptive release consistency protocol that switches dynamically between eager and lazy behaviour based on runtime information, and they report superior performance for several applications.

This bar chart compares different DSM consistency models based on their ordering constraints, communication overhead, and programming complexity. The chart uses a scale of 1-4 to represent relative scores for each attribute.

2.3. Multiprocessor System Topologies
Multi-processor system topologies describe the physical or logical arrangement of processors and their interconnections in a distributed system. The topology affects the performance, scalability, and fault tolerance of DSM systems (Hennessy & Patterson, 2011).
Some common multi-processor topologies include:
1. Bus-based Systems: Every processor has access to a common bus. This type of system is very easy to implement, but its scalability is severely limited by contention for the bus.
2. Mesh networks: Processors are arranged in a grid, with each processor connected to its immediate neighbors. Mesh networks provide good scalability and are commonly used in many-core processors.
3. Hypercube: Processors interconnected in a hypercube topology have short paths between any two nodes. This topology offers excellent scalability, but implementation may be quite complex for large systems.
4. Fat Tree: A tree structure in which bandwidth increases towards the root, providing high bisection bandwidth. Fat trees are widely adopted in high-performance clusters.
5. Torus: An extension of the mesh network in which edges wrap around to form toroidal structures, giving shorter communication paths than a simple mesh.

The impact of network topology on DSM performance has been studied extensively. For instance, Laudon and Lenoski (1997) demonstrated that the DASH multiprocessor could use a mesh-based topology and achieve near-linear speedup for a variety of parallel applications.
Recent work has implemented networks-on-chip (NoCs) for multi-core processors, and these can be considered for the implementation of DSM systems. An application-directed NoC architecture proposed by Kumar et al. (2002) adapts to the communication pattern of the application. Such an architecture achieves better performance than a traditional homogeneous design.
For a discussion of the implementation of a simple DSM system, consider Python code that demonstrates a basic page-based DSM (the original listing is not reproduced in this copy).
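Since the paper's own listing is unavailable, the following is only a minimal sketch of what a page-based DSM with a write-invalidate protocol might look like when simulated inside a single process. The names `PageDSM`, `Node`, and `demo`, the tiny `PAGE_SIZE`, and the use of one global lock to serialize protocol actions are all assumptions made for illustration; they are not taken from the paper.

```python
import threading

PAGE_SIZE = 4  # words per page; tiny on purpose (assumed value)


class PageDSM:
    """Home copies of all pages plus a sharer set per page.

    One lock serializes every protocol action. That keeps the sketch
    simple and sequentially consistent, at the cost of any parallelism.
    """

    def __init__(self, num_pages):
        self.home = [[0] * PAGE_SIZE for _ in range(num_pages)]
        self.sharers = [set() for _ in range(num_pages)]
        self.lock = threading.Lock()


class Node:
    """A simulated processor with a private page cache."""

    def __init__(self, name, dsm):
        self.name = name
        self.dsm = dsm
        self.cache = {}       # page number -> local copy of the page
        self.page_faults = 0

    def _ensure_cached(self, page_no):
        # Caller holds dsm.lock. A miss models a page fault plus a fetch.
        if page_no not in self.cache:
            self.page_faults += 1
            self.cache[page_no] = list(self.dsm.home[page_no])
            self.dsm.sharers[page_no].add(self)

    def read(self, addr):
        page_no, offset = divmod(addr, PAGE_SIZE)
        with self.dsm.lock:
            self._ensure_cached(page_no)
            return self.cache[page_no][offset]

    def write(self, addr, value):
        page_no, offset = divmod(addr, PAGE_SIZE)
        with self.dsm.lock:
            self._ensure_cached(page_no)
            # Write-invalidate: every other cached copy is discarded.
            for other in self.dsm.sharers[page_no] - {self}:
                del other.cache[page_no]
            self.dsm.sharers[page_no] = {self}
            self.cache[page_no][offset] = value
            self.dsm.home[page_no][offset] = value  # write-through to home


def demo():
    dsm = PageDSM(num_pages=2)
    a, b = Node("A", dsm), Node("B", dsm)
    a.write(0, 42)       # A faults page 0 in, then writes
    first = b.read(0)    # B faults page 0 in and sees 42
    a.write(1, 7)        # address 1 is on page 0: B's copy is invalidated
    second = b.read(1)   # B faults again and sees 7
    return first, second, a.page_faults, b.page_faults
```

Calling `demo()` returns `(42, 7, 1, 2)`: node B takes a second page fault because A's write to the same page invalidated B's copy. Read-modify-write sequences are not atomic in this sketch, so concurrent updates to the same word still need an application-level lock around `read` and `write`.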
Such an example is deliberately simple: it illustrates only the basic concepts of shared memory access and synchronization in a DSM system. In a real distributed environment, things are much more complex and entail additional mechanisms for inter-node communication, consistency maintenance, and fault tolerance.
The theoretical framework of DSM systems is still developing, and open issues remain: improving scalability, reducing communication overhead, and adapting to new hardware architectures. As the requirements of multi-core and distributed systems rise, the pressure on DSM implementations and on testing methodologies continues to grow, which drives continuous innovation in this field.

III. AUTOMATED TESTING STRATEGIES FOR DSM

3.1 Test Generation Techniques

3.1.1 Model-Based Test Generation
Model-based test generation creates test cases for DSM systems by building a formal model of the system that abstracts its memory access, consistency, and interprocess communication. Finite state machines (FSMs) are commonly used as an abstraction of the system's behavior. Leung and White (1989) proposed a method of generating test cases from an FSM that can be adapted to testing distributed systems; the model describes DSM states and memory transitions.
Another model-based approach uses Petri nets, which are well suited to concurrent systems. Carreira and Costa (1997) applied colored Petri nets to produce test cases, analyzing interleaving scenarios in order to find race conditions and synchronization problems.
UML state machines and activity diagrams are a more recent addition. Garousi et al. (2008) proposed generating stress tests from UML models. Their approach focuses on concurrent access to shared resources, which helps to find bottlenecks and consistency errors.

3.1.2 Combinatorial Testing Approaches
Combinatorial testing covers a variety of configurations and input combinations, such as memory access patterns and network topologies in DSM systems. Pairwise testing, in which all pairs of input parameter values are tested, is one of the most efficient methods. Kuhn et al. (2004) showed that pairwise testing detects faults well without the number of test cases becoming unwieldy.
Higher-strength combinations, such as 3-way or 4-way, provide better fault detection but increase the number of test cases. Nie and Leung (2011) explored adaptive random combinatorial testing, which aims to balance higher fault detection against fewer test cases. Garvin et al. (2011) took a different approach and proposed applying system-specific constraints to combinatorial testing, so that the generated tests are both comprehensive and valid for DSM systems.

3.2 Fault Injection Mechanisms
The effect of faults on DSM systems can be tested using fault injection, the deliberate introduction of errors to test fault tolerance. Hardware-based fault injection tools, like Arlat et al.'s RIFLE tool (1990), simulate hardware faults in multiprocessor systems but are costly and generally less flexible.
Software-based fault injection is more flexible and more commonly used; it can simulate a wide range of faults, including memory corruption and network failures. Network-level fault injection is especially relevant to DSM. Kanawati et al. (1995) proposed the FERRARI tool, which injects faults into the operating system and the application layers. Dawson et al. (1996) developed Orchestra, which simulates message delays, losses, and corruption to assess the impact of network-related failures on DSM systems.
Recent advances include sophisticated fault injection techniques that use machine learning algorithms to steer the injection of faults towards specific critical vulnerabilities. Banzai et al. (2010) detail a system in which critical fault scenarios in DSM can be identified automatically using machine learning.

3.3 Concurrency and Race Condition Identification
Concurrency problems and race conditions in DSM are extremely challenging to identify, since such problems are often intermittent and very hard to reproduce; test cases alone are therefore not enough.
Static analysis techniques discover potential race conditions without running the code. Engler and Ashcraft (2003) developed RacerX, a tool that finds race conditions and deadlocks in large-scale systems. However, static analysis may produce false positives and will overlook some dynamic runtime issues.
Dynamic analysis tools monitor the execution of a program for concurrency faults. For the detection of data races, Savage et al. (1997) proposed Eraser, a tool based on the lockset algorithm. Variants of it have been applied to distributed systems, including DSM.
Hybrid approaches combine static and dynamic analysis to achieve high accuracy efficiently. Choi et al. (2002) showed that static analysis can guide dynamic race detection, significantly reducing runtime overhead while retaining good detection rates.
Recent work in predictive analysis relies on trace analysis to predict concurrency-related problems. Huang et al. (2014) proposed MaxSMT, a framework that discovers latent concurrency bugs in large-scale systems, including DSM.
Since DSM systems are widely used in high-performance and data-intensive computing, developing more efficient automated testing methods remains highly important, with the goals of more sophisticated tests, better test coverage, and fewer false positives.

This grouped bar chart compares the effectiveness and implementation complexity of different automated testing strategies for DSM systems. The scores are based on a scale of 0-100, derived from the research findings.

IV. AUTOMATED TEST EXECUTION AND MONITORING

4.1 Parallel Test Execution Frameworks
Parallel test execution frameworks are essential for exercising a DSM system under test because they allow multiple test cases to be executed concurrently across the distributed nodes. This creates realistic concurrent scenarios while minimizing the overall testing time. GTAC has been very effective in bringing to light various parallel testing frameworks that apply to DSM systems.
A good example of such a framework is Selenium Grid, which was primarily designed to test web applications but is also used for distributed systems testing. In this framework, tests are executed in parallel on machines running different operating systems, so it can be useful for testing DSM in heterogeneous environments. Another example is the TestNG framework developed by Cédric Beust, which has built-in support for parallel test execution and has already been applied effectively in DSM testing scenarios.
Among the latest innovations in parallel test execution is the development of cloud-based testing platforms. For instance, Orso and Rothermel (2014) reported on the emergence of Testing-as-a-Service (TaaS) platforms, which leverage cloud infrastructure to provide scalable, on-demand testing resources. Such a platform would be highly appropriate for DSM testing, as it makes large-scale distributed scenarios easy to set up.

4.2 Real-time Monitoring and Logging
Real-time monitoring and logging are needed to understand the behavior of DSM systems under test: they give insight into actual performance and resource usage and a clear view of problems as they arise. Barham et al. (2004) proposed Magpie, which captures distributed system behavior by monitoring events across operating systems, middleware, and applications.
Log analysis is a very important aspect of DSM testing. The Elastic Stack (Elasticsearch, Logstash, and Kibana) is currently one of the most popular solutions for collecting, processing, and visualizing log data from distributed systems; it helps to find patterns or anomalies in a test run.
Distributed tracing systems are also very important for monitoring DSM systems. Sigelman et al. (2010) presented Dapper, a tracing system for large-scale distributed systems, which in turn inspired tools like Jaeger and Zipkin. These tools give end-to-end visibility into request flows, enabling the identification of performance bottlenecks and the analysis of system behaviour under different test settings.

4.3 Performance Metrics and Benchmarking
Performance metrics and benchmarks measure the efficiency and scalability of DSM systems. The primary metrics are throughput, latency, memory consistency, and scalability. The Standard Performance Evaluation Corporation (SPEC) has developed SPECjbb, among other benchmarks, to quantify Java server performance in multi-threaded environments.
Open-source benchmarks are often reused for DSM with adaptations. For instance, the widely known PARSEC benchmark suite (Bienia et al., 2008), which consists of multi-threaded programs, has been adapted to assess large-scale DSM implementations. NASA's NAS Parallel Benchmarks (NPB), built from applications related to computational fluid dynamics (CFD), test parallel and distributed systems, including DSM.
Recently, benchmarks have begun to target emerging DSM architectures. Ferdman et al. (2012) developed CloudSuite, a benchmark suite of scale-out workloads for cloud environments that includes data analytics, serving, and media streaming workloads, and is thus well suited to DSM evaluation.

This line graph shows the trends of key performance metrics (throughput, latency, and consistency) as the number of processors increases in a DSM system. The x-axis uses a logarithmic scale to better represent the exponential growth in the number of processors.

V. VERIFICATION AND VALIDATION TECHNIQUES

5.1. Formal Verification Methods
Formal verification provides mathematical proofs of correctness for DSM systems and therefore gives strong confidence in system behaviour. Model checking is a popular technique that explores the state space of a system to confirm that certain properties are satisfied. Clarke et al. (1999) provide a comprehensive survey of model checking for concurrent and distributed systems.
Another technique used in DSM verification is theorem proving. The Isabelle/HOL theorem prover, originally constructed by Nipkow et al. in 2002, has been used to verify properties of DSM algorithms. Coq has likewise been used to verify distributed consensus algorithms, which are among the algorithms that can ensure the consistency of DSM.
Recent work concentrates on compositional verification techniques, which fight state explosion by verifying components in isolation and then combining the results. Flanagan et al. (2005) presented thread-modular verification, which has been used successfully for concurrent and distributed systems, including DSM.

5.2. Runtime Assertion Checking
In runtime assertion checking, logical assertions are added to the code, and any behavioural violation becomes evident during execution. Its advantage is that it can identify consistency and synchronization problems that static analysis has missed. The Java Modeling Language (JML) by Leavens et al. (1999) supports runtime assertion checking for Java programs and has been extended with support for concurrent and distributed systems, which makes it suitable for DSM testing.
Recent developments in this area include efficient assertion checking for large-scale distributed systems. Meredith et al. (2012) proposed JavaMOP, a runtime verification framework that monitors distributed Java applications at runtime, using aspect-oriented programming to instrument code with checks, and that can be used to detect violations in DSM systems.

5.3. Automated Oracles for DSM Testing
Test oracles determine whether a test case has passed or failed. Creating oracles for DSM systems is difficult because of the complex interactions and non-deterministic behavior involved. Baresi and Young (2001) give an overview of oracle strategies for distributed systems testing.
Metamorphic testing: Chen et al. introduced metamorphic testing in 1998 as a promising technique that relies on known relationships between multiple executions to overcome the oracle problem. It has been used successfully in many parallel and distributed systems, including DSM.
Recent advances include the use of machine learning to generate oracles automatically. Vanmali et al. (2002) showed how neural networks can learn the behaviour of distributed systems and serve as oracles that detect anomalies. This approach has good potential for finding inconsistencies and performance-related issues in DSM.

VI. TEST RESULT ANALYSIS

6.1. Statistical Test Results Analysis
Statistical analysis is particularly useful for analysing test output from DSM systems, especially when automated test execution produces large volumes of data. Hypothesis testing and confidence interval estimation are among the common methods for drawing meaningful insights from test results.
Regression analysis is very effective in gauging the relationship between multiple system parameters and performance metrics in DSM. For example, Zhou et al. (2004) used multiple regression analysis to model DSM performance under various workload conditions, thereby identifying the factors that affect the system's scalability.
Recent developments in statistical analysis techniques for DSM testing include Bayesian inference methods. These can incorporate prior knowledge about system behaviour into the analysis of test results, improving the accuracy and precision of performance predictions and anomaly detection.

6.2. Machine Learning for Anomaly Detection
Machine learning techniques have come to lead the analysis of test results and the detection of anomalies in DSM systems. Supervised learning algorithms, such as support vector machines (SVMs) and random forests, are widely applied to classify system behaviours and to identify possible faults from historical test data.
Unsupervised learning approaches, especially clustering algorithms, are well suited to DSM anomaly detection, where anomalies appear as deviations from normal patterns. For instance, Xu et al. (2009) employed a modified K-means clustering algorithm to identify performance anomalies in large-scale distributed systems, including DSM-based systems.
Deep learning methods have also proved promising for anomaly detection in DSM. Recurrent neural networks (RNNs) and long short-term memory (LSTM) networks have been widely applied, with good results, to time series data from distributed systems, where subtle temporal patterns can indicate system problems.

6.3. Test Data Visualization Techniques
Visualization techniques help testers and developers understand complex interactions and performance characteristics in DSM systems. Graphical presentation of test results reveals patterns and anomalies that might not be evident from raw numbers alone.
Heat maps and color-coded matrices are widely used to show access patterns and contention in DSM systems, which allows hotspots and potential performance bottlenecks to be identified. Node-link diagrams and force-directed graphs are commonly used to represent the topology and communication patterns of distributed systems and so help in the analysis of network-related problems.
New research on visualization for DSM testing addresses interactive and real-time visualization facilities. These allow a tester to explore massive datasets dynamically, zooming into parts of a timeline or into a system component as needed. For example, Adamoli and Hauswirth (2010) proposed Trevis, a trace visualization and analysis tool for exploring the behaviour of large-scale parallel applications, which can be applied to DSM systems.

VII. CHALLENGES AND LIMITATIONS

7.1. Scalability Issues in Large-Scale Systems
Testing DSM systems at scale is a hard problem because interactions are highly complex and the data volume grows exponentially with system size. Traditional testing approaches fail to identify emergent behaviors that appear only when the system is scaled up. Cantin et al. (2005) discuss the challenges of scaling cache coherence protocols for DSM systems and the need for innovative testing approaches that could alleviate these problems.
One way to overcome scalability problems is to use emulation and simulation. One such tool is BigSim, developed by Zheng et al. (2004), which can simulate very large parallel systems on a small cluster. This lets the tester analyse the system's behaviour at other scales without needing large-scale hardware.

7.2. Non-determinism in Multi-Processor Environments
Non-determinism pervades the testing of multi-processor systems, including DSM-based systems. The interleaving of operations across processors may give rise to race conditions and timing-dependent bugs that are challenging to reproduce and debug. Lu et al. (2008) present a comprehensive study of the characteristics of concurrency bugs and their implications for distributed system testing.
Methods for handling non-determinism include deterministic replay systems, which attempt to replay exact execution sequences for debugging. For example, the idea of deterministic shared memory multiprocessing (DMP) was developed by Hower and Hill in 2008; it is an environment that provides a deterministic context for parallel programs while still delivering high performance.

This logarithmic plot shows the relationship between test execution time and two types of coverage: code coverage and state space coverage. It illustrates the challenges in achieving comprehensive testing for DSM systems.

7.3. Test Coverage and Completeness
Exhaustive test coverage is difficult to achieve in DSM systems because the state space is massive and the interaction between distributed components adds further complexity. Traditional code coverage metrics may miss most aspects of distributed behavior and thus are not sufficient for assessing DSM system tests.
Recent research has focused on designing coverage metrics targeted at distributed systems. As an example, Stoller (2002) introduces partial-order coverage for testing concurrent systems, which tries to capture the coverage of different event orderings rather than simple code paths.

VIII. FUTURE RESEARCH DIRECTIONS

8.1. Integration with Emerging Hardware Architectures
As hardware architectures advance, DSM testing research will have to take into account the challenges posed by emerging technologies such as non-volatile memory, 3D-stacked memory, and heterogeneous computing systems. New testing strategies may be needed for DSM implementations on such architectures in order to ensure their correctness and performance.

8.2. Cloud-Based DSM Testing Platforms
The increasing adoption of cloud computing is both a challenge and an opportunity for developing scalable, on-demand testing frameworks for DSM systems. It may eventually lead to cloud-native testing frameworks that allocate resources dynamically and simulate large-scale distributed environments with high fidelity.

8.3. AI-Driven Test Optimization Strategies
The integration of artificial intelligence and machine learning techniques will allow various aspects of DSM testing to be optimized. Future research could use reinforcement learning algorithms that automatically generate and refine test cases, or apply natural language processing techniques to the analysis of system logs for potential issues.

IX. CONCLUSION

9.1. Summary of Key Findings
This work covered several aspects of automation in Distributed Shared Memory testing for multi-processor systems. Major findings include the relevance of model-based and combinatorial testing approaches, the efficiency of fault injection-based techniques, and the value of runtime monitoring with assertions for the correctness and performance of the system.

9.2. Indicative Implications for Industry and Research
The conclusions drawn from this work have implications for both industry practice and academic research. For industry, adopting an automated testing strategy may lead to more robust and reliable DSM implementations, and some of the development costs could be recovered through performance improvements. For researchers, the present work draws attention to topics that are worth further exploration, particularly the scalability and non-determinism challenges in DSM testing.

9.3. Recommendations for Implementation
Based on the results of this research, the following strategies are strongly recommended for efficient automated testing of DSM systems.
1. Adopt a combination of static analysis and dynamic analysis techniques for DSM implementations to detect potential problems.
2. Use parallel test execution frameworks and cloud-based testing platforms to improve testing efficiency and to scale properly.
3. Develop robust monitoring and logging mechanisms that give deep insight into system behaviour during test execution.
4. Explore machine learning and AI-based anomaly detection methods and test optimization techniques.
5. Invest in developing domain-specific benchmarks and performance metrics that best portray DSM system behaviour.
By following these recommendations and keeping track of the latest research in this field, organizations should improve their capability to develop and maintain reliable, high-performance DSM systems across multi-processor environments.

X. REFERENCES

[1]. Adamoli, A., & Hauswirth, M. (2010). Trevis: A context tree visualization & analysis tool for performance traces. In Proceedings of the 5th international symposium on Software visualization (pp. 153–162).
[2]. Adve, S. V., & Gharachorloo, K. (1996). Shared memory consistency models: A tutorial. Computer, 29(12), 66–76.
[3]. Arlat, J., Aguera, M., Amat, L., Crouzet, Y., Fabre, J. C., Laprie, J. C., & Powell, D. (1990). Fault injection for dependability validation: A methodology and some applications. IEEE Transactions on Software Engineering, 16(2), 166–182.
[4]. Banzai, T., Koizumi, H., Kanbayashi, R., Imada, T., Hanawa, T., & Sato, M. (2010). D-cloud: Design of a software testing environment for reliable distributed systems using cloud computing technology. In 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing (pp. 631–636).
[5]. Baresi, L., & Young, M. (2001). Test oracles. Technical Report CIS-TR-01-02, University of Oregon, Dept. of Computer and Information Science, Eugene, Oregon, USA.
[6]. Barham, P., Donnelly, A., Isaacs, R., & Mortier, R. (2004). Using magpie for request extraction and workload modelling. In OSDI (Vol. 4, pp. 18–18).
[7].
[7]. Bershad, B. N., & Zekauskas, M. J. (1991). Midway: Shared memory parallel programming with entry consistency for distributed memory multiprocessors. Carnegie-Mellon University, Department of Computer Science.
[8]. Bienia, C., Kumar, S., Singh, J. P., & Li, K. (2008). The PARSEC benchmark suite: Characterization and architectural implications. In Proceedings of the 17th international conference on Parallel architectures and compilation techniques (pp. 72–81).
[9]. Cantin, J. F., Lipasti, M. H., & Smith, J. E. (2005). The complexity of verifying memory coherence and consistency. IEEE Transactions on Parallel and Distributed Systems, 16(7), 663–671.
[10]. Carreira, J., & Costa, D. (1997). Automatically verifying an object-oriented specification of the Steam-Boiler control system. In International Symposium of Formal Methods Europe (pp. 262–279). Springer, Berlin, Heidelberg.
[11]. Chen, T. Y., Cheung, S. C., & Yiu, S. M. (1998). Metamorphic testing: A new approach for generating next test cases. Technical Report HKUST-CS98-01, Department of Computer Science, Hong Kong University of Science and Technology, Hong Kong.
[12]. Choi, J. D., Lee, K., Loginov, A., O'Callahan, R., Sarkar, V., & Sridharan, M. (2002). Efficient and precise data-race detection for multithreaded object-oriented programs. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming language design and implementation (pp. 258–269).
[13]. Clarke, E. M., Grumberg, O., & Peled, D. (1999). Model checking. MIT press.
[14]. Dawson, S., Jahanian, F., & Mitton, T. (1996). ORCHESTRA: A fault injection environment for distributed systems. In Proceedings of 26th International Symposium on Fault-Tolerant Computing (FTCS-26) (pp. 404–414).
[15]. Engler, D., & Ashcraft, K. (2003). RacerX: Effective, static detection of race conditions and deadlocks. ACM SIGOPS Operating Systems Review, 37(5), 237–252.
[16]. Ferdman, M., Adileh, A., Kocberber, O., Volos, S., Alisafaee, M., Jevdjic, D., & Falsafi, B. (2012). Clearing the clouds: A study of emerging scale-out workloads on modern hardware. ACM SIGPLAN Notices, 47(4), 37–48.
[17]. Flanagan, C., Freund, S. N., Qadeer, S., & Seshia, S. A. (2005). Modular verification of multithreaded programs. Theoretical Computer Science, 338(1–3), 153–183.
[18]. Garvin, B. J., Cohen, M. B., & Dwyer, M. B. (2011). Evaluating improvements to a meta-heuristic search for constrained interaction testing. Empirical Software Engineering, 16(1), 61–102.
[19]. Garousi, V., Briand, L. C., & Labiche, Y. (2008). Traffic-aware stress testing of distributed real-time systems based on UML models using genetic algorithms. Journal of Systems and Software, 81(2), 161–185.
[20]. Gharachorloo, K., Lenoski, D., Laudon, J., Gibbons, P., Gupta, A., & Hennessy, J. (1990). Memory consistency and event ordering in scalable shared-memory multiprocessors. ACM SIGARCH Computer Architecture News, 18(2SI), 15–26.
[21]. Hennessy, J. L., & Patterson, D. A. (2011). Computer architecture: A quantitative approach. Elsevier.
[22]. Hower, D. R., & Hill, M. D. (2008). Rerun: Exploiting episodes for lightweight memory race recording. ACM SIGARCH Computer Architecture News, 36(3), 265–276.
[23]. Huang, J., Meredith, P. O., & Rosu, G. (2014). Maximal sound predictive race detection with control flow abstraction. ACM SIGPLAN Notices, 49(6), 337–348.
[24]. Kanawati, G. A., Kanawati, N. A., & Abraham, J. A. (1995). FERRARI: A flexible software-based fault and error injection system. IEEE Transactions on Computers, 44(2), 248–260.
[25]. Keleher, P., Cox, A. L., Dwarkadas, S., & Zwaenepoel, W. (1992). Lazy release consistency for software distributed shared memory. In Proceedings of the 19th annual international symposium on Computer architecture (pp. 13–21).
[26]. Kuhn, D. R., Wallace, D. R., & Gallo, A. M. (2004). Software fault interactions and implications for software testing. IEEE Transactions on Software Engineering, 30(6), 418–421.
[27]. Lamport, L. (1979). How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Transactions on Computers, 100(9), 690–691.
[28]. Laudon, J., & Lenoski, D. (1997). The SGI Origin: A ccNUMA highly scalable server. ACM SIGARCH Computer Architecture News, 25(2), 241–251.
[29]. Leavens, G. T., Baker, A. L., & Ruby, C. (1999). Preliminary design of JML: A behavioral interface specification language for Java. ACM SIGSOFT Software Engineering Notes, 31(3), 1–38.
[30]. Leung, H. K., & White, L. (1989). Insights into regression testing (software testing). In Proceedings. Conference on Software Maintenance-1989 (pp. 60–69). IEEE.
[31]. Li, K., & Hudak, P. (1989). Memory coherence in shared virtual memory systems. ACM Transactions on Computer Systems (TOCS), 7(4), 321–359.
[32]. Bhavesh Kataria, Characterization and Identification of Rice Grains Through Digital Image Analysis in International Journal – Sanshodhan, ISSN 0975-4245, December, 2011 (Print)
[33]. Lu, S., Park, S., Seo, E., & Zhou, Y. (2008). Learning from mistakes: A comprehensive study on real world concurrency bug characteristics. ACM SIGARCH Computer Architecture News, 36(1), 329–339.
[34]. Meredith, P. O., Jin, D., Griffith, D., Chen, F., & Roşu, G. (2012). An overview of the MOP runtime verification framework. International Journal on Software Tools for Technology Transfer, 14(3), 249–289.
[35]. Nelson, J., Holt, B., Myers, B., Briggs, P., Ceze, L., Kahan, S., & Oskin, M. (2015). Latency-tolerant software distributed shared memory. In 2015 USENIX Annual Technical Conference (USENIX ATC 15) (pp. 291–305).
[36]. Nie, C., & Leung, H. (2011). A survey of combinatorial testing. ACM Computing Surveys (CSUR), 43(2), 1–29.
[37]. Nipkow, T., Paulson, L. C., & Wenzel, M. (2002). Isabelle/HOL: A proof assistant for higher-order logic (Vol. 2283). Springer Science & Business Media.
[38]. Orso, A., & Rothermel, G. (2014). Software testing: A research travelogue (2000–2014).
[39]. Santhosh Palavesh. (2019). The Role of Open Innovation and Crowdsourcing in Generating New Business Ideas and Concepts. International Journal for Research Publication and Seminar, 10(4), 137–147. https://doi.org/10.36676/jrps.v10.i4.1456
[40]. Challa, S. S. S., Tilala, M., Chawda, A. D., & Benke, A. P. (2019). Investigating the use of natural language processing (NLP) techniques in automating the extraction of regulatory requirements from unstructured data sources. Annals of Pharma Research, 7(5), 380-387.
[41]. Challa, S. S., Tilala, M., Chawda, A. D., & Benke, A. P. (2019). Investigating the use of natural language processing (NLP) techniques in automating the extraction of regulatory requirements from unstructured data sources. Annals of PharmaResearch, 7(5), 380-387.
[42]. Bhavesh Kataria, "Role of Information Technology in Agriculture : A Review", International Journal of Scientific Research in Science, Engineering and Technology, Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 1, Issue 1, pp.01-03, 2014. Available at : https://doi.org/10.32628/ijsrset141115
[43]. Dr. Saloni Sharma, & Ritesh Chaturvedi. (2017). Blockchain Technology in Healthcare Billing: Enhancing Transparency and Security. International Journal for Research Publication and Seminar, 10(2), 106–117. Retrieved from https://jrps.shodhsagar.com/index.php/j/article/view/1475
[44]. Bhaskar, V. V. S. R., Etikani, P., Shiva, K., Choppadandi, A., & Dave, A. (2019). Building explainable AI systems with federated learning on the cloud. Journal of Cloud Computing and Artificial Intelligence, 16(1), 1–14.
[45]. Bhavesh Kataria, Analysis of Rice Grains Through Digital Image Processing, SCI-TECH Research (National Journal) ISSN 0974 – 9780, February, 2012 (Print)
[46]. Big Data Analytics using Machine Learning Techniques on Cloud Platforms. (2019). International Journal of Business Management and Visuals, ISSN: 3006-2705, 2(2), 54-58. https://ijbmv.com/index.php/home/article/view/76
[47]. Secure Federated Learning Framework for Distributed Ai Model Training in Cloud Environments. (2019). International Journal of Open Publication and Exploration, ISSN: 3006-2853, 7(1), 31-39. https://ijope.com/index.php/home/article/view/145
[48]. Challa, S. S. S., Tilala, M., Chawda, A. D., & Benke, A. P. (2019). Investigating the use of natural language processing (NLP) techniques in automating the extraction of regulatory requirements from unstructured data sources. Annals of Pharma Research, 7(5).
[49]. Ghavate, N. (2018). An Computer Adaptive Testing Using Rule Based. Asian Journal For Convergence In Technology (AJCT) ISSN -2350-1146, 4(I). Retrieved from http://asianssr.org/index.php/ajct/article/view/443
[50]. Tripathi, A. (2019). Serverless architecture patterns: Deep dive into event-driven, microservices, and serverless APIs. International Journal of Creative Research Thoughts (IJCRT), 7(3), 234-239. Retrieved from http://www.ijcrt.org
[51]. Kanchetti, D., Munirathnam, R., & Thakkar, D. (2019). Innovations in workers compensation: XML shredding for external data integration. Journal of Contemporary Scientific Research, 3(8). ISSN (Online) 2209-0142.
[52]. Aravind Reddy Nayani, Alok Gupta, Prassanna Selvaraj, Ravi Kumar Singh, & Harsh Vaidya. (2019). Search and Recommendation Procedure with the Help of Artificial Intelligence. International Journal for Research Publication and Seminar, 10(4), 148–166. https://doi.org/10.36676/jrps.v10.i4.1503
[53]. Rinkesh Gajera, "Leveraging Procore for Improved Collaboration and Communication in Multi-Stakeholder Construction Projects", International Journal of Scientific Research in Civil Engineering (IJSRCE), ISSN : 2456-6667, Volume 3, Issue 3, pp.47-51, May-June.2019
[54]. Gudimetla, S. R., et al. (2015). Mastering Azure AD: Advanced techniques for enterprise identity management. Neuroquantology, 13(1), 158-163. https://doi.org/10.48047/nq.2015.13.1.792
[55]. Gudimetla, S. R., & et al. (2015). Beyond the barrier: Advanced implementation strategies for firewall and management. NeuroQuantology, 13(4), 558-565. https://doi.org/10.48047/nq.2015.13.4.876
[56]. Bhavesh Kataria, "Variant of RSA-Multi prime RSA", International Journal of Scientific Research in Science, Engineering and Technology, Print ISSN : 2395-1990, Online ISSN : 2394-4099, Volume 1, Issue 1, pp.09-11, 2014. Available at https://doi.org/10.32628/ijsrset14113