Volume 3, Issue 8, August 2013    ISSN: 2277 128X
International Journal of Advanced Research in Computer Science and Software Engineering
Research Paper
Available online at: www.ijarcsse.com

Software Complexity Metrics: A Survey

Dr. P. Chitti Babu*, Professor & Principal, APGCCS, pcb_mca@yahoo.com
A. Narasimha Prasad, MCA Department, APGCCS, anp.aits@gmail.com
D. Sudhakar, MCA Department, APGCCS, dsudhakar2all@gmail.com

Abstract— Managing software complexity is the most important technical topic in software development, because the complexity of software affects many other attributes, such as testability and maintainability. Researchers have proposed many software complexity metrics, most of them based on the source code of the software. Software complexity metrics evaluate how complex a product is and help to improve it. Complexity metrics are a kind of internal metric, visible only to the development team. In this paper, various traditional and object-oriented complexity metrics relevant for measuring software product complexity are discussed along with their limitations, and a solution is proposed.

Keywords— software complexity, complexity metrics, object oriented metrics, internal metrics, software quality

I. INTRODUCTION
Complexity is the quality of consisting of many interrelated parts. When software consists of many interrelated parts, it becomes more difficult to reason about. Software projects, especially larger ones, must have code that is understandable, so a key component of any architecture is keeping complexity at a workable level. Software must be made modular, so that a developer can work on one section of code without worrying about the rest of the project. Measuring and controlling software complexity is an important aspect of every software development paradigm, because the complexity of software affects many other attributes, such as testability and maintainability. For this reason researchers have proposed many software complexity metrics, most of them based on the source code of the software. Object-oriented analysis and design provide many benefits, such as reusability, decomposition of a problem into easily understood objects, and support for future modification. To deal with software complexity, the problem has to be split into sub-problems and sub-tasks, so as to contain the complexity within one part of the code.

II. REVISION OF VARIOUS EXISTING METRICS
Software complexity cannot be removed completely, but it can be controlled. To control software complexity effectively, metrics are required to measure it, and many researchers have proposed metrics for evaluating and predicting software complexity. This section describes various metrics which may be applied to measuring it.

A. Traditional Software Complexity Metrics
Traditional software complexity metrics have been designed and applied for measuring the complexity of structured systems since 1976. Among these, developers have often found Control Flow Complexity (McCabe metric), Language Complexity (Halstead metric), Interface Complexity (Henry fan-in/fan-out metric), Data Flow Complexity (Elshoff metric), Decisional Complexity (McClure metric), and Data Complexity (Chapin metric) to be the most commonly used [3,6,10].

1) Control Flow Complexity (McCabe Metric): McCabe's metrics are based on a control flow representation of the program. A program graph is used to depict control flow: nodes represent processing tasks (one or more code statements) and edges represent transfer of control between nodes [1,8,13]. McCabe's cyclomatic complexity counts the number of linearly independent paths through a block of code; it takes its name from counting the number of cycles in the program control flow graph, and for a single connected graph it is computed as V(G) = E - N + 2, where E is the number of edges and N is the number of nodes. Lower values are better; McCabe suggested using ten as a threshold value, beyond which modules should be split into smaller units. Fig. 1 shows the flow graph notation.

Fig. 1 Flow graph notation (figure not reproduced)
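By way of illustration, the following is a minimal sketch, not taken from the paper, that computes V(G) = E - N + 2 for a hypothetical control flow graph represented as an adjacency list; the graph and the function name are assumptions made only for this example.

    # Minimal sketch: cyclomatic complexity V(G) = E - N + 2 for a
    # single connected control flow graph given as an adjacency list.
    # The example graph is hypothetical, not taken from the paper.
    def cyclomatic_complexity(cfg):
        n = len(cfg)                                  # N: number of nodes
        e = sum(len(succ) for succ in cfg.values())   # E: number of edges
        return e - n + 2                              # V(G) = E - N + 2

    # Control flow graph of a routine with one if/else and one loop;
    # each key is a node, each value is the list of its successors.
    cfg = {
        "entry":     ["if"],
        "if":        ["then", "else"],  # decision node: two outgoing edges
        "then":      ["loop"],
        "else":      ["loop"],
        "loop":      ["loop_body", "exit"],
        "loop_body": ["loop"],
        "exit":      [],
    }

    print(cyclomatic_complexity(cfg))   # 8 edges - 7 nodes + 2 = 3

The result of 3 agrees with counting decision points: one if plus one loop condition gives two decisions, and for structured code V(G) equals the number of decisions plus one.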
2) Language Complexity (Halstead Metric): The Halstead complexity metrics use counts of total and distinct operators and operands to compute the volume, difficulty, and effort of a piece of code. The Halstead metric is a way of determining a quantitative measure of complexity directly from the operators and operands in the module [2,10,13]; it measures complexity by summarizing the numbers of operators and operands contained in a program. The Halstead measures are based on four scalar numbers derived directly from a program's source code:

n1 = the number of unique or distinct operators
n2 = the number of unique or distinct operands
N1 = the total number of operators
N2 = the total number of operands

Example:
if (k < 2) { if (k > 3) x = x*k; }

Distinct operators: if ( ) { } > < = * ;
Distinct operands: k 2 3 x
n1 = 10, n2 = 4, N1 = 13, N2 = 7

From these numbers, five measures are derived:

TABLE I
FIVE MEASURES FOR LANGUAGE COMPLEXITY

Measure               Symbol   Formula
Program Length        N        N = N1 + N2
Program Vocabulary    n        n = n1 + n2
Volume                V        V = N log2(n)
Difficulty            D        D = (n1/2) * (N2/n2)
Effort                E        E = D * V
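As a minimal sketch, not part of the original paper, the five measures of Table I can be computed directly from the four scalar counts of the example above:

    import math

    # Minimal sketch: the five Halstead measures of Table I, applied to
    # the paper's example counts (n1 = 10, n2 = 4, N1 = 13, N2 = 7).
    def halstead(n1, n2, N1, N2):
        N = N1 + N2                # program length
        n = n1 + n2                # program vocabulary
        V = N * math.log2(n)       # volume
        D = (n1 / 2) * (N2 / n2)   # difficulty
        E = D * V                  # effort
        return N, n, V, D, E

    N, n, V, D, E = halstead(10, 4, 13, 7)
    print(N, n)                    # 20 14
    print(round(V, 2))             # 20 * log2(14) ≈ 76.15
    print(D)                       # (10/2) * (7/4) = 8.75
    print(round(E, 2))             # 8.75 * 76.15 ≈ 666.29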
3) Interface Complexity (Henry fan-in/fan-out Metric): Interface complexity measures complexity as a function of fan-in and fan-out [9]. Fan-in is defined as the number of local flows into a procedure plus the number of data structures from which that procedure retrieves information. Fan-out is defined as the number of local flows out of a procedure plus the number of data structures that the procedure updates. In other words, fan-in is the sum of the procedures that call a given procedure, the parameters read, and the global variables read; fan-out is the sum of the procedures called, the parameters written to (exposed to outside users or passed in by reference), and the global variables written to. High fan-in and fan-out may be indicative of a module that is difficult to understand.

Complexity = (procedure length) * (fan-in * fan-out)^2

4) Data Flow Complexity (Elshoff Metric): The data-flow metric is based on the number of variables referenced but not defined in a program block, i.e., a sequence of statements between branches [3,4]. A variable definition is the assignment of a value to a variable; a variable reference is the use of a variable in an expression or as an output. For example, Oviedo's complexity metric is a weighted sum of a control-flow metric and a data-flow metric:

C = a*CF + b*DF

where CF is the control flow complexity, measured as the total number of edges in the control flow graph; DF is the data flow complexity, measured as the sum of the data flow complexities of the blocks in the program, the data flow complexity of a block being the number of variables referenced but not defined in that block; and a and b are weighting factors that may be assumed to be 1.

5) Decisional Complexity (McClure Metric): Decisional complexity is the sum of the number of comparisons in a module and the number of control variables referenced in the module [3]:

Complexity = C + V

where C is the number of comparisons in a module and V is the number of control variables referenced in the module. This metric is similar to McCabe's, but with regard to control variables.

6) Data Complexity (Chapin Metric): Data complexity describes how much information is being used in a method. The claim is that the more variables a method uses, the harder it is to understand and modify, and the fewer the variables, the less complex the code is [3,7]. The value of the metric for a given method is

Complexity = P + 2M + 3C + 0.5T

where P is the number of input variables (including global variables) used in calculating the output, M is the number of variables modified or created in the method, C is the number of variables that participate in determining control flow, and T is the number of variables that are unused.

These traditional software complexity metrics are, however, applicable only at the method level.

B. Object Oriented Software Complexity Metrics
Object-oriented metrics can be applied to analyze source code, in any OO language, as an indicator of quality attributes. The most influential object-oriented complexity metrics are the six proposed by Chidamber and Kemerer, designed specifically for object-oriented software. These are direct measures: Weighted Methods per Class (WMC), Depth of Inheritance (DIT), Response For Class (RFC), Coupling Between Objects (CBO), Lack of Cohesion Method (LCOM), and Number of Children (NOC) [2,5,11].

1) Weighted Methods per Class (WMC): The Weighted Methods per Class metric counts the combined complexity of the local methods in a given class. WMC is defined as the sum of the complexities of a class's local methods:

WMC = Σ ci, i = 1, ..., n

where ci is the complexity (e.g., volume, cyclomatic complexity, etc.) of each of the n methods.

Viewpoints:
o The number of methods and the complexity of those methods indicate how much time and effort is required to develop and maintain the object.
o The larger the number of methods in an object, the greater the potential impact on its children.
o Objects with a large number of methods are likely to be more application specific, limiting the possibility of reuse.
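As a minimal sketch, with a hypothetical class and hypothetical per-method values chosen only for illustration, WMC reduces to a simple sum once a per-method complexity measure such as cyclomatic complexity has been chosen:

    # Minimal sketch: WMC as the sum of per-method complexities.
    # The class and the cyclomatic complexity values are hypothetical.
    def wmc(method_complexities):
        return sum(method_complexities)

    # Suppose a class Account has four methods whose measured
    # cyclomatic complexities are 1, 3, 2 and 5:
    account = {"open": 1, "deposit": 3, "withdraw": 2, "close": 5}
    print(wmc(account.values()))    # WMC = 11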
2) Depth of Inheritance (DIT): The Depth of Inheritance metric is defined per class. The depth of a class within the inheritance hierarchy is the maximum number of steps from the class node to the root of the tree; that is, DIT is the maximum length from a node to the root (base class), measured by the number of ancestor classes. At the same time, greater depth represents greater potential for reuse of inherited methods.

Viewpoints:
o Lower-level subclasses inherit a number of methods, making their behavior harder to predict.
o Deeper trees indicate greater design complexity.

3) Response For Class (RFC): RFC is the number of methods that could be called in response to a message to a class (local + remote). A high value of RFC indicates high complexity of the class.

Viewpoints: as RFC increases,
o testing effort increases,
o the complexity of the object is greater, and
o the object is harder to understand.

4) Coupling Between Objects (CBO): For a given class, this metric measures the number of other classes to which the class is coupled. CBO is the number of collaborations between two classes (the fan-out of a class C), i.e., the number of other classes referenced in class C (a reference to another class A being a reference to a method or a data member of class A).

Viewpoints:
o As collaboration increases, reuse decreases.
o High fan-outs represent coupling of the class to other classes/objects and are thus undesirable.
o High fan-ins represent good object design and a high level of reuse.
o It is not possible to maintain high fan-in and low fan-out across the entire system.

5) Lack of Cohesion Method (LCOM): The cohesion of a class is characterized by how closely the local methods are related to the local instance variables of the class. LCOM gives the number of disjoint sets of local methods [2,11]: it is the number of empty intersections minus the number of non-empty intersections, which captures a notion of degree of similarity of methods; two methods are similar if they use common instance variables.

Consider a class Ck with n methods M1, ..., Mn, and let Ij be the set of instance variables used by Mj, so that there are n such sets I1, ..., In. Define

P = {(Ii, Ij) | Ii ∩ Ij = ∅}
Q = {(Ii, Ij) | Ii ∩ Ij ≠ ∅}

(if all n sets Ii are ∅, then P = ∅). Then

LCOM = |P| - |Q|, if |P| > |Q|
LCOM = 0, otherwise

Note that an LCOM of zero does not mean the class is maximally cohesive, since it covers both |P| = |Q| and |P| < |Q|.

Example: take a class C with methods M1, M2, M3 and
I1 = {a, b, c, d, e}
I2 = {a, b, e}
I3 = {x, y, z}
Then P = {(I1, I3), (I2, I3)} and Q = {(I1, I2)}, so LCOM = 2 - 1 = 1 (a small computational sketch of this example is given at the end of this section).

6) Number of Children (NOC): NOC is the number of subclasses immediately subordinate to a class; it gives the number of immediate successors of the class [12].

Viewpoints:
o As NOC grows, reuse increases, but the abstraction may be diluted.
o Depth is generally better than breadth in a class hierarchy, since it promotes reuse of methods through inheritance.
o Classes higher up in the hierarchy should have more subclasses than those lower down.
o NOC gives an idea of the potential influence a class has on the design: classes with a large number of children may require more testing.

Various flaws and inconsistencies have been observed in this suite of six class-based metrics.
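Returning to the LCOM example above, the following minimal sketch, not taken from the paper, reproduces the |P| and |Q| computation from the instance-variable sets:

    from itertools import combinations

    # Minimal sketch: LCOM = |P| - |Q| if |P| > |Q|, else 0, where P is
    # the set of method pairs with disjoint instance-variable sets and
    # Q the set of pairs sharing at least one instance variable.
    def lcom(instance_variable_sets):
        p = q = 0
        for a, b in combinations(instance_variable_sets, 2):
            if a & b:        # non-empty intersection: similar methods
                q += 1
            else:            # empty intersection: disjoint methods
                p += 1
        return p - q if p > q else 0

    # The paper's example: class C with methods M1, M2, M3.
    I1 = {"a", "b", "c", "d", "e"}
    I2 = {"a", "b", "e"}
    I3 = {"x", "y", "z"}
    print(lcom([I1, I2, I3]))    # |P| = 2, |Q| = 1, so LCOM = 1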
III. CONCLUSIONS
Although many researchers have proposed various types of complexity metrics for measuring software complexity, users of these metrics must be aware of their limitations and approach their application cautiously. These metrics can only be calculated after a major development effort has been committed to producing the source code; they cannot provide early feedback during the specification phase, and it is subsequently expensive to change the system if the metrics so indicate. Object-oriented metrics are being used to evaluate and predict the quality of software, but it needs to be noted that the validity of these metrics can sometimes be criticized. Many things, including fatigue and mental and physical stress, can affect the performance of programmers, with a resulting impact on external metrics. The only thing that can reasonably be stated is that the empirical relationships between software product metrics are not likely to be strong, because there are other effects that are not accounted for. But, as has been demonstrated in a number of studies, these metrics can still be useful in practice.

REFERENCES
[1] T. J. McCabe, "A Complexity Measure," IEEE Transactions on Software Engineering, vol. SE-2, no. 4, pp. 308-320, 1976.
[2] S. M. Jamali, "Object Oriented Metrics," Department of Computer Engineering, Sharif University of Technology, January 2006.
[3] P. Rook (ed.), Software Reliability Handbook, Kluwer Academic Publishers, ISBN 1-85166-400-9, pp. 455-460.
[4] I. Sommerville, Software Engineering, 3rd edition, Addison-Wesley, 1989, pp. 337-345.
[5] N. I. Churcher and M. J. Shepperd, "Comments on 'A Metrics Suite for Object-Oriented Design'," IEEE Transactions on Software Engineering, vol. 21, pp. 263-265, 1995.
[6] H. M. Sneed, "Measuring 75 Million Lines of Code: A Report from the Field," Metrikon, Anecon, Vienna.
[7] H. F. Li and W. K. Cheung, "An Empirical Study of Software Metrics," IEEE Transactions on Software Engineering, vol. SE-13, no. 6, pp. 697-708, June 1987.
[8] T. J. McCabe, "Structured Testing: A Software Testing Methodology Using the Cyclomatic Complexity Metric," National Bureau of Standards Special Publication 500-99, December 1982.
[9] H. Zuse, Software Complexity, Walter de Gruyter, Berlin, 1991.
[10] Sheng Yu and Shijie Zhou, "A Survey on Metric of Software Complexity," School of Computer Science and Engineering, University of Electronic Science and Technology of China.
[11] S. R. Chidamber and C. F. Kemerer, "A Metrics Suite for Object Oriented Design," IEEE Transactions on Software Engineering, vol. 20, no. 6, pp. 476-493, 1994.
[12] A. Fothi, J. Nyeky-Gaizler and Z. Porkolab, "The Structured Complexity of Object-Oriented Programs," Mathematical and Computer Modelling, 2003.
[13] P. Chitti Babu and K. C. K. Bharathi, "Assessment of Maintainability Factor," International Journal of Computer Science Engineering and Information Technology Research (IJCSITR), vol. 3, pp. 29-42, 2013.