Cohesion Metrics
Cohesion metrics measure how well the methods of a class are related to each other. A cohesive class performs one function. A non-cohesive class performs two or more unrelated functions and may need to be restructured into two or more smaller classes.
The assumption behind the following cohesion metrics is that methods are related if they work on the same class-level variables. Methods are unrelated if they work on different variables altogether. In a cohesive class, methods work with the same set of variables. In a non-cohesive class, some methods work on different data.
See also: Object-oriented metrics
The idea of the cohesive class
A cohesive class performs one function. Lack of cohesion means that a class performs more than one function. This is not desirable. If a class performs several unrelated functions, it should be split up.
High cohesion is desirable since it promotes encapsulation. As a drawback, a highly cohesive class has high coupling between the methods of the class, which in turn indicates high testing effort for that class. Low cohesion indicates inappropriate design and high complexity. It has also been found to indicate a high likelihood of errors. The class should probably be split into two or more smaller classes.
LCOM metrics: Lack of Cohesion of Methods. This group of metrics aims to detect problem classes. A high LCOM value means low cohesion.
TCC and LCC metrics: Tight and Loose Class Cohesion. This group of metrics aims to tell the difference between good and bad cohesion. With these metrics, large values are good and low values are bad.
Cohesion diagrams visualize class cohesion.
The Non-cohesive classes report suggests which classes should be split and how.
LCOM: Lack of Cohesion of Methods
There are several LCOM (lack of cohesion of methods) metrics. Project Analyzer provides 4 variants: LCOM1, LCOM2, LCOM3 and LCOM4. We recommend the use of LCOM4 for Visual Basic systems. The other variants may be of scientific interest.
LCOM4 (Hitz & Montazeri), recommended metric
LCOM4 is the lack of cohesion metric we recommend for Visual Basic programs. LCOM4 measures the number of "connected components" in a class. A connected component is a set of related methods (and class-level variables). There should be only one such component in each class. If there are 2 or more components, the class should be split into that many smaller classes.
Which methods are related? Methods a and b are related if:
1. they both access the same class-level variable, or
2. a calls b, or b calls a.
After determining the related methods, we draw a graph linking the related methods to each other. LCOM4 equals the number of connected groups of methods.
LCOM4=1 indicates a cohesive class, which is the "good" class.
LCOM4>=2 indicates a problem. The class should be split into that many smaller classes.
LCOM4=0 happens when there are no methods in a class. This is also a "bad" class.
The example on the left shows a class consisting of methods A through E and variables x and y. A calls B and B accesses x. Both C and D access y. D calls E, but E doesn't access any variables. This class consists of 2 unrelated components (LCOM4=2). You could split it as {A, B, x} and {C, D, E, y}.
In the example on the right, we made C access x to increase cohesion. Now the class consists of a single component (LCOM4=1). It is a cohesive class.
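The two examples above can be reproduced with a short sketch. It is only an illustration of the LCOM4 rules as stated here, not Project Analyzer's implementation: the function name lcom4 and the dictionary-based class description are assumptions made for this example, and every method named in a call is assumed to be listed in the access table.

```python
from collections import defaultdict

def lcom4(accesses, calls):
    """Count connected components of methods (a sketch of LCOM4).

    accesses: dict mapping method name -> set of class-level variables it uses
    calls:    iterable of (caller, callee) pairs inside the class
    """
    neighbours = {m: set() for m in accesses}
    # Rule 1: two methods are related if they access the same variable.
    users_of = defaultdict(set)
    for method, variables in accesses.items():
        for var in variables:
            users_of[var].add(method)
    for users in users_of.values():
        for method in users:
            neighbours[method] |= users
    # Rule 2: two methods are related if one calls the other.
    for caller, callee in calls:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)
    # Count connected groups with a depth-first search.
    seen, components = set(), 0
    for start in accesses:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(neighbours[current] - seen)
    return components

calls = [("A", "B"), ("D", "E")]
left = {"A": set(), "B": {"x"}, "C": {"y"}, "D": {"y"}, "E": set()}
right = dict(left, C={"x", "y"})  # C now also accesses x

print(lcom4(left, calls))   # 2
print(lcom4(right, calls))  # 1
```

A class with no methods yields 0 components, which matches the LCOM4=0 case described earlier.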
Note that UserControls as well as VB.NET forms and web pages frequently report high LCOM4 values. Even if the value exceeds 1, it often does not make sense to split the control, form or web page, as that would affect the user interface of your program. The explanation with UserControls is that they store information in the underlying UserControl object. The explanation with VB.NET is the form designer generated code that you cannot modify.
Implementation details for LCOM4. We use the same definition for a method as with the WMC metric. This means that property accessors are considered regular methods, but inherited methods are not taken into account. Both Shared and non-Shared variables and methods are considered.
We ignore empty procedures, though. Empty procedures tend to increase LCOM4 as they do not access any variables or other procedures. A cohesive class with empty procedures would have a high LCOM4. Sometimes empty procedures are required (for classic VB Implements, for example). This is why we simply drop empty procedures from LCOM4.
We also ignore constructors and destructors (Sub New, Finalize, Class_Initialize, Class_Terminate). Constructors and destructors frequently set and clear all variables in the class, making all methods connected through these variables, which increases cohesion artificially.
Suggested use. Use the Non-cohesive classes report and Cohesion diagrams to determine how the classes could be split. It is good to remove dead code before searching for non-cohesive classes. Dead procedures can increase LCOM4 as the dead parts can be disconnected from the other parts of the class.
Readings for LCOM4
Hitz M., Montazeri B.: Measuring Coupling and Cohesion In Object-Oriented Systems. Proc. Int. Symposium on Applied Corporate Computing, Oct. 25-27, Monterrey, Mexico, 75-76, 197, 78-84. http://www.isys.uni-klu.ac.at/PDF/1995-0043MHBM.pdf Includes definition of LCOM4 named as "Improving LCOM".
LCOM1, LCOM2 and LCOM3: less suitable for VB
LCOM1, LCOM2 and LCOM3 are not as suitable for Visual Basic projects as LCOM4. They are less accurate, especially as they don't consider the impact of property accessors and procedure calls, which are both frequently used to access the values of variables in a cohesive way. They may be more appropriate to other object-oriented languages such as C++. We provide these metrics for the sake of completeness. You can use them as complementary metrics in addition to LCOM4.
LCOM1 (Chidamber & Kemerer)
LCOM1 was introduced in the Chidamber & Kemerer metrics suite. It's also called LCOM or LOCOM, and it's calculated as follows:
Take each pair of methods in the class. If they access disjoint sets of instance variables, increase P by one. If they share at least one variable access, increase Q by one.
LCOM1 = P - Q, if P > Q
LCOM1 = 0 otherwise
LCOM1 = 0 indicates a cohesive class.
LCOM1 > 0 indicates that the class needs to be or can be split into two or more classes, since its variables belong in disjoint sets.
Classes with a high LCOM1 have been found to be fault-prone.
A high LCOM1 value indicates disparateness in the functionality provided by the class. This metric can be used to identify classes that are attempting to achieve many different objectives, and consequently are likely to behave in less predictable ways than classes that have lower LCOM1 values. Such classes could be more error-prone and more difficult to test, and could possibly be disaggregated into two or more classes that are better defined in their behavior. The LCOM1 metric can be used by senior designers and project managers as a relatively simple way to track whether the cohesion principle is adhered to in the design of an application and to advise changes.
LCOM1 critique
LCOM1 has received its share of criticism. It has been shown to have a number of drawbacks, so it should be used with caution.
First, LCOM1 gives a value of zero for very different classes. To overcome that problem, new metrics, LCOM2 and LCOM3, have been suggested (see below).
Second, Gupta suggests that LCOM1 is not a valid way to measure cohesiveness of a class. That's because its definition is based on method-data interaction, which may not be a correct way to define cohesiveness in the object-oriented world. Moreover, very different classes may have an equal LCOM1.
Third, as LCOM1 is defined on variable access, it's not well suited for classes that internally access their data via properties. A class that gets/sets its own internal data via its own properties, and not via direct variable read/write, may show a high LCOM1. This is not an indication of a problematic class. LCOM1 is not suitable for measuring such classes.
Implementation details. The definition of LCOM1 deals with instance variables but all methods of a class. Class variables (Shared variables in VB.NET) are not taken into account. In contrast, all the methods are taken into account, whether Shared or not.
Project Analyzer assumes that a procedure in a class is a method if it can have code in it. Thus, Subs, Functions and each of Property Get/Set/Let are methods, whereas a DLL declare or Event declaration is not a method. What is more, empty procedure definitions, such as abstract MustOverride procedures in VB.NET, are not methods.
Readings for LCOM1
Shyam R. Chidamber, Chris F. Kemerer. A Metrics suite for Object Oriented design. M.I.T. Sloan School of Management E53-315. 1993. http://uweb.txstate.edu/~mg43/CS5391/Papers/Metrics/OOMetrics.pdf
Victor Basili, Lionel Briand and Walcelio Melo. A Validation of Object-Oriented Design Metrics as Quality Indicators. IEEE Transactions on Software Engineering. Vol. 22, No. 10, October 1996. http://www.cs.umd.edu/users/basili/publications/journals/J60.pdf
Bindu S. Gupta: A Critique of Cohesion Measures in the Object-Oriented Paradigm. Master of Science Thesis. Michigan Technological University, Department of Computer Science. 1997.
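The P and Q counting rule for LCOM1 can be sketched in a few lines. This is a minimal illustration of the definition given above, not the tool's code; the function name lcom1 and the two sample classes (an account class mixing balance handling with logging, and a purely balance-oriented one) are hypothetical.

```python
from itertools import combinations

def lcom1(accesses):
    """LCOM1 = P - Q if P > Q, else 0 (sketch).

    accesses: dict mapping method name -> set of instance variables it uses
    """
    p = q = 0
    for vars_a, vars_b in combinations(accesses.values(), 2):
        if vars_a & vars_b:
            q += 1  # the pair shares at least one variable
        else:
            p += 1  # the pair works on disjoint sets of variables
    return p - q if p > q else 0

# Hypothetical class mixing two concerns: balance handling and logging.
mixed = {
    "deposit": {"balance"},
    "withdraw": {"balance"},
    "log": {"history"},
    "clear_log": {"history"},
}
# Hypothetical class where every method works on the same variable.
focused = {
    "deposit": {"balance"},
    "withdraw": {"balance"},
    "report": {"balance"},
}

print(lcom1(mixed))    # 2  (P = 4, Q = 2)
print(lcom1(focused))  # 0  (P = 0, Q = 3)
```

Because the result is clamped at zero, any class with P <= Q scores 0 no matter how different its structure is, which is the first drawback listed in the critique.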
LCOM2 and LCOM3 (Henderson-Sellers, Constantine & Graham) To overcome the problems of LCOM1, two additional metrics have been proposed: LCOM2 and LCOM3. A low value of LCOM2 or LCOM3 indicates high cohesion and a well-designed class. It is likely that the system has good class subdivision implying simplicity and high reusability. A cohesive class will tend to provide a high degree of encapsulation. A higher value of LCOM2 or LCOM3 indicates decreased encapsulation and increased complexity, thereby increasing the likelihood of errors.
Which one to choose, LCOM2 or LCOM3? This is a matter of taste. LCOM2 and LCOM3 are similar measures with different formulae. LCOM2 varies in the range [0,1] while LCOM3 is in the range [0,2]. LCOM3>=1 indicates a very problematic class. LCOM2 has no single threshold value.
It is a good idea to remove any dead variables before interpreting the values of LCOM2 or LCOM3. Dead variables can lead to high values of LCOM2 and LCOM3, thus leading to wrong interpretations of what should be done.
Definitions used for LCOM2 and LCOM3
m = number of procedures (methods) in class
a = number of variables (attributes) in class
mA = number of methods that access a variable (attribute)
sum(mA) = sum of mA over attributes of a class
Implementation details. m is equal to WMC. a contains all variables whether Shared or not. All accesses to a variable are counted.
LCOM2
LCOM2 = 1 - sum(mA)/(m*a)
LCOM2 equals the percentage of methods that do not access a specific attribute, averaged over all attributes in the class. If the number of methods or attributes is zero, LCOM2 is undefined and displayed as zero.
LCOM3 (alias LCOM*)
LCOM3 = (m - sum(mA)/a) / (m-1)
LCOM3 varies between 0 and 2. Values 1..2 are considered alarming.
In a normal class whose methods access the class's own variables, LCOM3 varies between 0 (high cohesion) and 1 (no cohesion). When LCOM3=0, each method accesses all variables. This indicates the highest possible cohesion. LCOM3=1 indicates extreme lack of cohesion. In this case, the class should be split.
When there are variables that are not accessed by any of the class's methods, 1 < LCOM3 <= 2. This happens if the variables are dead or they are only accessed outside the class. Both cases represent a design flaw. The class is a candidate for rewriting as a module. Alternatively, the class variables should be encapsulated with accessor methods or properties. There may also be some dead variables to remove.
If there is no more than one method in a class, LCOM3 is undefined. If there are no variables in a class, LCOM3 is undefined. An undefined LCOM3 is displayed as zero.
Readings for LCOM2/LCOM3
B. Henderson-Sellers, L. Constantine and I. Graham: Coupling and Cohesion (Towards a Valid Metrics Suite for Object-Oriented Analysis and Design). Object-Oriented Systems, 3(3), pp. 143-158, 1996.
B. Henderson-Sellers: Object-Oriented Metrics: Measures of Complexity. Prentice Hall, 1996.
Roger Whitney: Course material. CS 696: Advanced OO. Doc 6, Metrics. Spring Semester, 1997. San Diego State University. http://www.eli.sdsu.edu/courses/spring97/cs696/notes/metrics/metrics.html
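As a rough sketch of the two formulas, the following computes LCOM2 and LCOM3 from a table of which methods access which variables. The function name lcom2_lcom3 and the sample classes are assumptions for illustration only; undefined values are returned as zero, following the display convention described above.

```python
def lcom2_lcom3(accesses, variables):
    """Return (LCOM2, LCOM3) for one class (sketch).

    accesses:  dict mapping method name -> set of variables it uses
    variables: all variables declared in the class, used or not
    """
    m = len(accesses)
    a = len(variables)
    # sum(mA): for each variable, count the methods that access it.
    sum_ma = sum(
        sum(1 for used in accesses.values() if var in used)
        for var in variables
    )
    # Undefined cases are reported as zero.
    lcom2 = 1 - sum_ma / (m * a) if m and a else 0.0
    lcom3 = (m - sum_ma / a) / (m - 1) if m > 1 and a else 0.0
    return lcom2, lcom3

# Every method accesses every variable: highest possible cohesion.
print(lcom2_lcom3({"f": {"x", "y"}, "g": {"x", "y"}}, {"x", "y"}))  # (0.0, 0.0)

# Each variable is used by exactly one method: LCOM3 reaches 1.
print(lcom2_lcom3({"get_x": {"x"}, "get_y": {"y"}}, {"x", "y"}))    # (0.5, 1.0)

# A variable no method touches pushes LCOM3 above 1.
with_dead = lcom2_lcom3({"get_x": {"x"}, "get_y": {"y"}}, {"x", "y", "dead"})
print(with_dead[1] > 1)  # True
```

The third case shows why dead variables should be removed before interpreting the values: the methods are unchanged, yet both metrics rise.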
TCC and LCC: Tight and Loose Class Cohesion
The metrics TCC (Tight Class Cohesion) and LCC (Loose Class Cohesion) provide another way to measure the cohesion of a class. The TCC and LCC metrics are closely related to the idea of LCOM4, even though there are some differences. The higher TCC and LCC, the more cohesive and thus better the class.
For TCC and LCC we only consider visible methods (whereas the LCOMx metrics considered all methods). A method is visible unless it is Private. A method is visible also if it implements an interface or handles an event. In other respects, we use the same definition for a method as for LCOM4.
Which methods are related? Methods a and b are related if:
1. They both access the same class-level variable, or
2. The call trees starting at a and b access the same class-level variable.
For the call trees we consider all procedures inside the class, including Private procedures. If a call goes outside the class, we stop following that call branch.
When 2 methods are related this way, we call them directly connected. When 2 methods are not directly connected, but they are connected via other methods, we call them indirectly connected. Example: A - B - C are direct connections. A is indirectly connected to C (via B).
TCC and LCC definitions
NP = maximum number of possible connections = N * (N-1) / 2, where N is the number of methods
NDC = number of direct connections (number of edges in the connection graph)
NIC = number of indirect connections
Tight class cohesion TCC = NDC/NP
Loose class cohesion LCC = (NDC+NIC)/NP
TCC is in the range 0..1. LCC is in the range 0..1. TCC<=LCC. The higher TCC and LCC, the more cohesive the class is.
What are good or bad values? According to the authors, TCC<0.5 and LCC<0.5 are considered non-cohesive classes. LCC=0.8 is considered "quite cohesive". TCC=LCC=1 is a maximally cohesive class: all methods are connected. As the authors Bieman & Kang stated: If a class is designed in ad hoc manner and unrelated components are included in the class, the class represents more than one concept and does not model an entity. A class designed so that it is a model of more than one entity will have more than one group of connections in the class. The cohesion value of such a class is likely to be less than 0.5. LCC tells the overall connectedness. It depends on the number of methods and how they group together.
When LCC=1, all the methods in the class are connected, either directly or indirectly. This is the cohesive case. When LCC<1, there are 2 or more unconnected method groups. The class is not cohesive. You may want to review these classes to see why it is so. Methods can be unconnected because they access no class-level variables or because they access totally different variables. When LCC=0, all methods are totally unconnected. This is the non-cohesive case.
TCC tells the "connection density", so to speak (while LCC is only affected by whether the methods are connected at all).
TCC=LCC=1 is the maximally cohesive class where all methods are directly connected to each other. When TCC=LCC<1, all existing connections are direct (even though not all methods are connected). When TCC<LCC, the "connection density" is lower than what it could be in theory. Not all methods are directly connected with each other. For example, A & B are connected through variable x and B & C through variable y. A and C do not share a variable, but they are indirectly connected via B. When TCC=0 (and LCC=0), the class is totally non-cohesive and all the methods are totally unconnected.
This example shows the same class as above. The connections considered are marked with thick violet lines. A and B are connected via variable x. C and D are connected via variable y. E is not connected because its call tree doesn't access any variables. There are 2 direct ("tight") connections. There are no additional indirect connections this time. With 5 methods NP=10, thus TCC=LCC=2/10.
In the example on the right, we made C access x to increase cohesion. Now {A, B, C} are directly connected via x. C and D are still connected via y and E stays unconnected. There are 4 direct connections, thus TCC=4/10. The indirect connections are A-D and B-D. Thus, LCC=(4+2)/10=6/10.
TCC/LCC readings
Bieman, James M. & Kang, Byung-Kyoo: Cohesion and reuse in an object-oriented system. Proceedings of the 1995 Symposium on Software. Pages: 259 - 262. ISSN:0163-5948. ACM Press New York, NY, USA. http://doi.acm.org/10.1145/211782.211856 The original definition of TCC and LCC.
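The TCC and LCC figures for the class with methods A through E can be checked with a small sketch. It follows the definitions given in this section under simplifying assumptions: the helper names call_tree_vars and tcc_lcc are made up for this example, all five methods are treated as visible, and a class with fewer than two visible methods is reported as 0 for both metrics.

```python
from itertools import combinations

def call_tree_vars(method, accesses, calls):
    """Variables reached from a method by following calls inside the class."""
    seen, found, stack = set(), set(), [method]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        found |= accesses.get(current, set())
        stack.extend(calls.get(current, ()))
    return found

def tcc_lcc(visible, accesses, calls):
    """Return (TCC, LCC) for the visible methods of one class (sketch)."""
    n = len(visible)
    np_max = n * (n - 1) // 2
    if np_max == 0:
        return 0.0, 0.0  # assumption: report 0 when no pair exists
    reach = {m: call_tree_vars(m, accesses, calls) for m in visible}
    # Direct connection: the two call trees share a class-level variable.
    adjacent = {m: set() for m in visible}
    ndc = 0
    for a, b in combinations(visible, 2):
        if reach[a] & reach[b]:
            ndc += 1
            adjacent[a].add(b)
            adjacent[b].add(a)
    # NDC + NIC = number of method pairs inside each connected group.
    connected_pairs, seen = 0, set()
    for start in visible:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            current = stack.pop()
            if current not in group:
                group.add(current)
                stack.extend(adjacent[current] - group)
        seen |= group
        connected_pairs += len(group) * (len(group) - 1) // 2
    return ndc / np_max, connected_pairs / np_max

methods = ["A", "B", "C", "D", "E"]
calls = {"A": ["B"], "D": ["E"]}
first = {"B": {"x"}, "C": {"y"}, "D": {"y"}}
second = {"B": {"x"}, "C": {"x", "y"}, "D": {"y"}}

print(tcc_lcc(methods, first, calls))   # (0.2, 0.2)
print(tcc_lcc(methods, second, calls))  # (0.4, 0.6)
```

Counting the pairs inside each connected group gives NDC+NIC directly, so the indirect connections never have to be enumerated one by one.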
High LCOM4 means a non-cohesive class. LCOM4=1 is best. Higher values are non-cohesive.
High TCC and LCC mean a cohesive class. TCC=LCC=1 is best. Lower values are less cohesive.
Auxiliary methods (leaf methods that don't access variables) are treated differently. LCOM4 accepts them as cohesive methods. TCC and LCC consider them non-cohesive. See method E in the examples above.
Validity of cohesion Is data cohesion the right kind of cohesion? Should the data and the methods in a class be related? If your answer is yes, these cohesion measures are the right choice for you. If, on the other hand, you don't care about that, you don't need these metrics. There are several ways to design good classes with low cohesion. Here are some examples:
- A class groups related methods, not data. If you use classes as a way to group auxiliary procedures that don't work on class-level data, the cohesion is low. This is a viable, cohesive way to code, but not cohesive in the "connected via data" sense.
- A class groups related methods operating on different data. The methods perform related functionalities, but the cohesion is low as they are not connected via data.
- A class provides stateless methods in addition to methods operating on data. The stateless methods are not connected to the data methods and cohesion is low.
- A class provides no data storage. It is a stateless class with minimal cohesion. Such a class could well be written as a standard module, but if you prefer classes instead of modules, the low cohesion is not a problem, but a choice.
- A class provides storage only. If you use a class as a place to store and retrieve related data, but the class doesn't act on the data, its cohesion can be low. Consider a class that encapsulates 3 variables and provides 3 properties to access each of these 3 variables. Such a class displays low cohesion, even though it is well designed. The class could well be split into 3 small classes, yet this may not make any sense.
Metrics for Object Oriented Software Development
By Samudra Gupta
Introduction
The software development process is no doubt a complicated one. The end product follows a chain of analysis, design, development and testing. At each stage, it is important to follow a well-defined methodology to ensure a quality end product. For large-scale projects, each stage in the whole process is a challenge. In this context, software design and coding metrics play an important role in ensuring the desired quality. In this article, we will examine a couple of important object-oriented metrics and see how they can be adopted at the design and development stages of a project life cycle to minimize risk and improve software quality.
Why Metrics
In the object-oriented software development process, the system is viewed as a collection of objects. The functionality of the application is achieved by interaction among these objects in terms of messages. Whenever one object depends on another object for certain functionality, there is a relationship between those two classes. In modern J2EE-like development, it is recommended that application software is split into multiple layers. This ensures "separation of concerns". With this, we have objects from one layer talking to the objects of another layer. In order to achieve perfect separation of concerns, objects should rely on the interfaces and contracts offered by another object without relying on any underlying implementation details. For example, the application layer depends on the Database Access Layer to access data. The application layer, however, should never need to know how the data is physically accessed and what the underlying data store is. This is called abstraction. Thus, the correct level of abstraction helps build a flexible and scalable application.
It is not an easy job to reach the correct level of abstraction and the correct relationship between classes. It is better if we can detect possible faults at an early stage of the design process, so that the design can be corrected accordingly. OO design metrics can be a very helpful measuring technique to evaluate design stability.
Also, given a correct abstraction of layers and appropriate relationships between the classes, there is still a chance that the coding process introduces a few more vulnerabilities. These vulnerabilities are not defective coding as such but have more to do with the internal structure of the code. At this stage, too, OO metrics can help identify whether we need to pay further attention to any of the code to make it more maintainable.
This is the role of software design and development metrics. They are used to ensure better quality and maintainability as a whole. It is also observed that following these metrics makes writing test cases easier. Any application that can be tested easily is easier to maintain and debug.
Few Object Oriented Metrics
We will now discuss a few object-oriented metrics and see in what context they can be used and what the benefits of using them are.
Cyclomatic Complexity (CC):
Cyclomatic complexity is a measure of the complexity of the algorithms used in a method. It is in essence a count of the number of test cases required to comprehensively test a method. If the method is drawn as a graph in which the nodes represent the procedural statements (if/else etc.) and the edges represent the transitions from one node to another, then the formula for cyclomatic complexity is:
CC = number of edges - number of nodes + 2
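The edges-minus-nodes-plus-two formula for cyclomatic complexity can be tried out on hand-built control-flow graphs. The sketch below assumes a single connected graph in which every node appears in at least one edge; the function name and the node labels are hypothetical.

```python
def cyclomatic_complexity(edges):
    """CC = edges - nodes + 2 for one connected control-flow graph (sketch)."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2

# Straight-line code: one transition between two statements.
straight = [("start", "end")]

# A single if without else: the condition either runs the body or skips it.
branch = [("if", "body"), ("body", "end"), ("if", "end")]

print(cyclomatic_complexity(straight))  # 1
print(cyclomatic_complexity(branch))    # 2
```

Each additional decision point adds one more edge than nodes, so the value grows by one per branch.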
Example:
For case 1: The cyclomatic complexity is 1-2+2 = 1.
For case 2: The cyclomatic complexity is 3-3+2 = 2.
The lower the complexity, the better. More complexity means more decision making and branching going on inside the code block. This makes it harder to test the method in a comprehensive manner.
Weighted Methods per Class (WMC):
This is defined as the sum of the complexities of all the methods defined in a class. If all the method complexities are reduced to unity (1), then WMC becomes equal to the number of methods.
WMC = sum of cyclomatic complexities of all the methods.
Following the discussion of cyclomatic complexity, a class with a high WMC is not recommended.
Response For Class (RFC):
The RFC is defined as the total number of methods that can be executed in response to a message to a class. This count includes all the methods available in the whole class hierarchy. If a class is capable of producing a vast number of outcomes in response to a message, testing all the possible outcomes becomes more difficult.
Lack of Cohesion of Methods (LCOM):
Cohesion is the degree to which the methods in a class are related to one another and work together to provide a set of behaviors. LCOM measures the degree of similarity of methods by their data inputs or the instance variables of the class. There are several LCOM calculation methods available:
LCOM1: Take each pair of methods in the class. If they access different sets of instance variables, increase P by one. If they share at least one variable, increase Q by one.
LCOM1 = P - Q, if P > Q
LCOM1 = 0 otherwise
LCOM1 = 0 indicates a cohesive class. LCOM1 > 0 indicates that the class is not quite cohesive and may need refactoring into two or more classes.
Classes with a high LCOM1 can be fault-prone. A high LCOM1 value indicates scatter in the functionality provided by the class. This metric can be used to identify classes that are attempting to achieve many different objectives, and consequently are likely to behave in less predictable ways than classes that have lower LCOM1 values.
LCOM1 disadvantages
LCOM1 suffers from the following drawbacks:
- LCOM1 gives a value of zero for very different classes.
- LCOM1 is not always a valid or the most appropriate way to measure cohesiveness of a class. That's because its definition is based on method-data interaction, which may not be a correct way to define cohesiveness in the object-oriented world. Moreover, very different classes may have an equal LCOM1.
- LCOM1 is defined on variable access, so it's not well suited for classes that internally access their data via properties. A class that has getters/setters for its own internal data may show a high LCOM1. This is not an indication of a problematic class. LCOM1 is not suitable for measuring such classes.
To overcome the problems of LCOM1, two additional metrics have been proposed: LCOM2 and LCOM3. A low value of LCOM2 or LCOM3 indicates high cohesion and a well-designed class. It is likely that the system has good class subdivision implying simplicity and high reusability. A cohesive class will tend to provide a high degree of encapsulation.
A higher value of LCOM2 or LCOM3 indicates decreased encapsulation and increased complexity, thereby increasing the likelihood of errors.
Which one to choose, LCOM2 or LCOM3? This is a matter of taste. LCOM2 and LCOM3 are similar measures with different formulae. LCOM2 varies in the range [0,1] while LCOM3 is in the range [0,2]. LCOM3>=1 indicates a very problematic class. LCOM2 has no single threshold value.
It is a good idea to remove any dead variables before interpreting the values of LCOM2 or LCOM3. Dead variables can lead to high values of LCOM2 and LCOM3, thus leading to wrong interpretations of what should be done.
Definitions for LCOM2 and LCOM3
m = number of methods in the class
a = number of variables (attributes) in the class
mA = number of methods that access a variable (attribute)
sum(mA) = sum of mA over the attributes of the class
Implementation details. m is equal to WMC. a contains all variables whether Shared or not. All accesses to a variable are counted.
LCOM2
LCOM2 is counted as the percentage of methods that do not access a specific attribute, averaged over all attributes in the class.
LCOM2 = 1 - sum(mA)/(m*a)
If the number of methods or variables is zero, LCOM2 is undefined and displayed as zero.
LCOM3 (Henderson-Sellers)
LCOM3 = (m - sum(mA)/a) / (m-1)
LCOM3 varies between 0 and 2. Values 1..2 are considered alarming.
In a normal class whose methods access its own variables, LCOM3 varies between 0 (high cohesion) and 1 (no cohesion). An LCOM3 value of 1 indicates a high lack of cohesion. When LCOM3=0, each method accesses all variables. This indicates the highest possible cohesion.
Coupling between Object Classes (CBO):
CBO = number of classes to which a class is coupled
Two classes are coupled when methods declared in one class use methods or instance variables defined by the other class. Only method calls and variable references are counted. Other types of reference, such as use of constants, calls to API declares, handling of events, use of user-defined types, and object instantiations are ignored. If a method call results in calling (using) other classes, all the classes to which the call can go are included in the coupled count.
High coupling between object classes means modules depend on each other too much and will be hard to reuse. The more independent a class is, the easier it is to reuse it in another application. In order to improve modularity and promote reuse, inter-object class couples should be kept to a minimum. The larger the number of couples, the higher the sensitivity to changes in other parts of the design, and therefore the more difficult the maintenance.
Depth of Inheritance (DIT)
Depth of Inheritance is the maximum length of the path from a given class to the root of the inheritance tree. In Java, as all classes inherit from the Object class, the minimum DIT is 1. It is a measure of the depth of the class hierarchy. The higher the value of DIT, the more methods child classes inherit from the base classes. In such situations, it becomes difficult to evolve the base classes and child classes. Thus, it is important to keep DIT low in a design.
Number of Children (NOC)
Number of Children is the number of immediate subclasses of a base class. If a base class has a very large number of children, it might be a candidate for refactoring to create a more sustainable and maintainable hierarchy.
Conclusion
Finally, we will see how we can categorize the above-mentioned metrics into several areas of object-oriented design.
Area (Class Item): Method, Collaboration, Cohesion, Coupling, Inheritance
Metrics: WMC, RFC, LCOM, CBO, DIT, NOC
The above table summarizes the areas that the discussed metrics can be applied to. By no means is this a complete set of the metrics that are available. There are plenty more, like Lines of Code, Fan-in/Fan-out etc. I suggest that interested readers go through them all in order to decide which metric is most suitable for their project. It is important to mention that, depending on the size and complexity of the project, the set of metrics needs to be adjusted to get the right benefit. The decision of which ones to use is mostly down to organizational practice and the experience of individual professionals.
Measuring Cohesion
Chidamber & Kemerer provide a metric for measuring a module's lack of cohesion (LCOM1). There are several minor improvements on this metric (LCOM2 and LCOM3). The basic idea is to measure how far a class is from information cohesion by measuring the degree to which the methods share the fields.
LCOM1
In a method coupling graph for a class C, MCG(C), nodes are methods. Two nodes are connected by an undirected edge if they both reference the same field. In fact, the edge can be labeled by the number of common attributes the methods reference.
(a) Draw MCG(Test) where Test is the following class:
    class Test {
        int a, x, y, z;
        void m1() { System.out.println(x); System.out.println(y); }
        void m2() { System.out.println(y); System.out.println(z); }
        void m3() { System.out.println(x); System.out.println(z); }
        void m4() { System.out.println(a); }
        void m5() { System.out.println(a); }
    }
LCOM1(C) is the maximum possible number of edges in MCG(C) less the actual number of edges:
LCOM1(C) = choose(n, 2) - e
where:
n = # of methods in C
e = # of edges in MCG(C)
(c) Assume a class C has 10 methods. What are the possible values of LCOM1(C)?
(d) Under what conditions does LCOM1(C) = 0?
LCOM2 and LCOM3
In a method-attribute graph for a class C, MAG(C), nodes are methods and attributes (fields). A directed edge connects a method node to an attribute node if the method references the attribute.
(e) Draw MAG(Test).
If every method of class C references every field of C, then MAG(C) would contain n*a edges, where:
n = # of methods
a = # of attributes
Assume:
e = # of edges in MAG(C)
Then e/(n*a) measures C's degree of cohesion. Since this number is <= 1, 1 minus this number measures C's lack of cohesion:
LCOM2(C) = 1 - e/(n * a)
(f) Compute LCOM2(Test).
Finally:
LCOM3(C) = n/(n - 1) * LCOM2(C) = (n - e/a)/(n - 1)
(g) Compute LCOM3(Test).
(h) Prove 0 <= LCOM2(C) <= 1
(i) Prove 0 <= LCOM3(C) <= 2
(j) What action would you recommend if LCOM3(C) > 1?
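One way to check hand-drawn graphs for the Test class is to build both graphs in code. The sketch below is only an aid under stated assumptions: the field accesses are transcribed by hand from the println calls in Test, and the helper names mcg_edges and mag_edges are invented for this example. Note that this graph-based LCOM1 counts only the unconnected pairs, so it differs from the P - Q variant.

```python
from itertools import combinations
from math import comb

# Fields referenced by each method of Test, transcribed by hand.
TEST = {
    "m1": {"x", "y"},
    "m2": {"y", "z"},
    "m3": {"x", "z"},
    "m4": {"a"},
    "m5": {"a"},
}
FIELDS = {"a", "x", "y", "z"}

def mcg_edges(cls):
    """Undirected edges between methods that reference a common field."""
    return [(p, q) for p, q in combinations(sorted(cls), 2) if cls[p] & cls[q]]

def mag_edges(cls):
    """Directed edges from each method to every field it references."""
    return [(m, f) for m in sorted(cls) for f in sorted(cls[m])]

def lcom1(cls):
    return comb(len(cls), 2) - len(mcg_edges(cls))

def lcom2(cls, fields):
    return 1 - len(mag_edges(cls)) / (len(cls) * len(fields))

def lcom3(cls, fields):
    n = len(cls)
    return n / (n - 1) * lcom2(cls, fields)

print(mcg_edges(TEST))                # the four edges of MCG(Test)
print(lcom1(TEST))                    # 6
print(round(lcom2(TEST, FIELDS), 2))  # 0.6
print(round(lcom3(TEST, FIELDS), 2))  # 0.75
```

Math.comb needs Python 3.8 or later; on older versions the pair count n*(n-1)//2 can be used instead.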
Object-oriented programming has two main objectives: to build highly cohesive classes and to maintain loose coupling between those classes. High cohesion means well-structured classes, and loose coupling means more flexible, extensible software. Applying object-oriented metrics to your design and code can help you determine whether you've achieved these goals.
What is Cohesion?
In OO methodology, classes contain certain data and exhibit certain behaviours. This concept may seem fairly obvious, but in practice, creating well-defined and cohesive classes can be tricky. Cohesive means that a certain class performs a set of closely related actions. A lack of cohesion, on the other hand, means that a class is performing several unrelated tasks. Though lack of cohesion may never have an impact on the overall functionality of a particular class, or of the application itself, the application software will eventually become unmanageable as more and more behaviours become scattered and end up in the wrong places.
Thus, one of the main goals of OO design is to come up with classes that are highly cohesive. Luckily, there's a metric to help you verify that you've designed a cohesive class.
The LCOM Metric: Lack of Cohesion in Methods
The Lack of Cohesion in Methods metric is available in the following three formats:
LCOM1: Take each pair of methods in the class and determine the set of fields they each access. If they have disjoint sets of field accesses, the count P increases by one. If they share at least one field access, Q increases by one. After considering each pair of methods:
RESULT = (P > Q) ? (P - Q) : 0
A low value indicates high coupling between the methods of the class, that is, high cohesion. This also indicates potentially high reusability and good class design. Chidamber and Kemerer provided the definition of this metric in 1993.
LCOM2: This is an improved version of LCOM1. Say you define the following items in a class:
m: number of methods in a class.
a: number of attributes in a class.
mA: number of methods that access a given attribute.
sum(mA): sum of all mA over all the attributes in the class.
LCOM2 = 1 - sum(mA)/(m*a)
If the number of methods or variables in a class is zero (0), LCOM2 is undefined and displayed as zero (0).
LCOM3: This is another improvement on LCOM1 and LCOM2 and was proposed by Henderson-Sellers. It is defined as follows:
LCOM3 = (m - sum(mA)/a) / (m-1)
where m, a, mA, sum(mA) are as defined in LCOM2.
The LCOM3 value varies between 0 and 2. LCOM3>1 indicates lack of cohesion and is considered a kind of alarm. If there is only one method in a class, or if there are no attributes in a class, LCOM3 is undefined and displayed as zero (0).
Each of these different measures of LCOM has a unique way to calculate the value of LCOM.
An extreme lack of cohesion such as LCOM3>1 indicates that the particular class should be split into two or more classes. If all the member attributes of a class are only accessed outside of the class and never accessed within the class, LCOM3 will show a high-value. A slightly high value of LCOM means that you can improve the design by either splitting the classes or re-arranging certain methods within a set of classes.