
Fuzzy set-based automatic bug triaging: NIER track

2011, 33rd International Conference on Software Engineering (ICSE 2011), NIER Track


Fuzzy Set-based Automatic Bug Triaging (NIER Track)

Ahmed Tamrawi, Tung Thanh Nguyen, Jafar Al-Kofahi, Tien N. Nguyen
Electrical and Computer Engineering Department, Iowa State University
{atamrawi,tung,jafar,tien}@iastate.edu

ABSTRACT

Assigning a bug to the right developer is key to reducing the cost, time, and effort of the bug-fixing process. This assignment process is often referred to as bug triaging. In this paper, we propose Bugzie, a novel approach for automatic bug triaging based on fuzzy set-based modeling of the bug-fixing expertise of developers. Bugzie considers a system to have multiple technical aspects, each associated with technical terms. It then uses a fuzzy set to represent the developers who are capable/competent of fixing the bugs relevant to each term. The membership function of a developer in a fuzzy set is calculated via the terms extracted from the bug reports that (s)he has fixed, and the function is updated as new fixed reports become available. For a new bug report, its terms are extracted and the corresponding fuzzy sets are union'ed. Potential fixers are then recommended based on their membership scores in the union'ed fuzzy set. Our preliminary results show that Bugzie achieves higher accuracy and efficiency than other state-of-the-art approaches.

Categories and Subject Descriptors: D.2.9 [Software Engineering]: Management

General Terms: Algorithms, Design, Reliability, Management

Keywords: Bug Triaging, Fuzzy Set

[Copyright notice: Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICSE '11, May 21-28, 2011, Waikiki, Honolulu, HI, USA. Copyright 2011 ACM 978-1-4503-0445-0/11/05 ...$10.00.]

Figure 1: Bug report 000002 in Eclipse project
  ID: 000002
  FixingDate: 2002-04-30 16:30:46 EDT
  AssignedTo: James Moody
  Summary: Opening repository resources doesn't honor type.
  Description: Opening repository resource always open the default text editor and doesn't honor any mapping between resource types and editors. As a result it is not possible to view the contents of an image (*.gif file) in a sensible way...

Figure 2: Bug report 006021 in Eclipse project
  ID: 006021
  FixingDate: 2002-05-08 14:50:55 EDT
  AssignedTo: James Moody
  Summary: New Repository wizard follows implementation model, not user model.
  Description: The new CVS Repository Connection wizard's layout is confusing. This is because it follows the implementation model of the order of fields in the full CVS location path, rather than the user model.

1. INTRODUCTION

Bug fixing is crucial in producing high-quality software products. When bug(s) are filed in a bug report, assigning it to the most capable and competent developer is important in reducing the cost and time of the bug-fixing process [5]. This assignment process is referred to as bug triaging [1]. To help developers in this task, we propose Bugzie, a novel automatic bug triaging approach that models the bug-fixing tendency/expertise of developers with respect to the technical aspects in a project based on their past fixing activity, via fuzzy set theory [8], and then leverages such information to recommend the most potential fixers for a new bug report.

Let us start with a motivating example on real-world bug reports. Figure 1 depicts a bug report from the Eclipse project, with the relevant fields including a unique identification number of the report (ID), the fixing date (FixingDate), the fixing developer (AssignedTo), a short summary (Summary), and a full description (Description) of the bug. The bug report describes an issue in which the system always used its default editor to open any resource file (e.g. a GIF file) regardless of its file type.
Analyzing the description, we found that this report is related to a technical aspect in Eclipse, namely, version control and management (VCM) of software artifacts. The concept of VCM can be recognized in the report's content via its descriptive terms such as repository, resource, or editor. This technical function can be considered project-specific since not all systems have it. Checking the corresponding fixed code in Eclipse, we found that the bug occurred in the code implementing an operation of VCM: opening a resource file in the repository. The bug was assigned to and fixed by a developer named James Moody.

Searching and analyzing several other Eclipse bug reports, we found that James also fixed other VCM-related bugs, for example, the one in report #6021 (Figure 2). That bug is also related to VCM: the function of repository connecting was not properly implemented. Thus, James Moody probably has the expertise, knowledge, or capability to fix the VCM-related bugs in the project.

Implications. The example suggests the following implications for our approach:

1. A software system has several technical aspects. Each aspect is expressed via technical terms. A bug report is related to one or multiple technical aspects.

2. If a developer frequently fixes the bugs related to a technical aspect, we can consider him to have bug-fixing expertise/capability on that aspect, i.e., he could be a capable/competent fixer for a future bug related to that aspect.

We can determine the capable developers for the technical aspects in the system based on their past fixing activities. When a new bug is filed, we recommend the developers who are most capable of fixing bugs in the corresponding aspects.

2. APPROACH

There are two key research questions in Bugzie: 1) how to represent the technical aspects of a system from software artifacts (e.g.
bug reports), and 2) given a bug report, how to determine who has the bug-fixing capability/expertise with respect to the reported technical aspect(s).

We consider a technical aspect as a collection of technical terms that are extracted directly from the software artifacts in a project, more specifically from bug reports. For the second research question, we utilize fuzzy set theory. We use a fuzzy set Ct to represent the set of developers who have the bug-fixing expertise relevant to a specific technical term t, that is, the set of developers who are the most competent to fix the bugs relevant to the term t. Whether a developer belongs to that fuzzy set is determined via the occurrences of the terms in the bug reports that he has fixed. For a new bug report B with one or multiple technical aspects, the set CB of capable developers toward B is modeled by a fuzzy set that is the union of all fuzzy sets (over developers) corresponding to all terms associated with B.

Our algorithm has three main stages: 1) training: building a fuzzy set Ct for each term t from available artifacts (e.g. fixed bug reports); 2) recommending: for a given unfixed bug report B, recommending a ranked list of developers capable of fixing it; and 3) updating: revising the fuzzy sets when new information (e.g. new fixed bug reports) is available.

2.1 Training

In Bugzie, a fuzzy set Ct is determined via a membership function µt with values in the range [0,1]. For a developer d, µt(d) determines how likely it is that d belongs to the fuzzy set Ct, i.e. the degree to which d is capable of fixing the bug(s) relevant to t. We calculate µt(d) based on the correlation between the set Dd of bug reports d has fixed and the set Dt of bug reports containing term t.
µt(d) = |Dd ∩ Dt| / |Dd ∪ Dt| = nd,t / (nt + nd − nd,t)

In this formula, nd, nt, and nd,t are the number of bug reports that d has fixed, the number of reports containing the term t, and the number of reports with both, respectively (counted from the available training data, i.e. the given fixed bug reports). The formula means that the more frequently a term t appears in the reports that developer d has fixed, the more likely it is that d has fixing expertise toward the technical aspects associated with t. The higher µt(d) is, the higher the degree to which d is a capable fixer for the bugs relevant to term t.

The value of µt(d) is in [0,1]. If µt(d) = 1, then only d has fixed the bug reports containing t; thus, d is highly capable of fixing the bugs relevant to the aspects associated with term t. If µt(d) = 0, d has never fixed any bug report containing t and thus might not be the right fixer with respect to t. Membership values strictly between 0 and 1 indicate the marginal elements of the class of developers defined by a term. Thus, membership in a fuzzy set is an intrinsically gradual notion, instead of concrete as in conventional logic. That is, the boundary of the set of developers who are capable of fixing the bug(s) relevant to a term t is fuzzy.

2.2 Recommending

In this step, Bugzie recommends the most capable developers for each given unfixed bug report B. Since B reports on one or more technical aspects, and those aspects can be recognized via the technical terms extracted from B, we consider the set of capable developers for B to be the fuzzy union set CB of all fuzzy sets corresponding to the terms in B:

CB = ∪_{t∈B} Ct

According to fuzzy set theory [8], the membership function of CB is calculated as follows:

µB(d) = 1 − ∏_{t∈B} (1 − µt(d))

It can be seen that µB(d) is also within [0,1] and, by fuzzy set theory, it represents the degree to which developer d belongs to the set of capable fixers for the bug(s) reported in B.
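Both scores reduce to arithmetic over three counters per developer and term. A minimal Python sketch on a toy corpus (the developer names and term sets below are hypothetical, for illustration only, not from the paper's data set):

```python
from collections import Counter

# Hypothetical training data: (fixer, terms extracted from a fixed report).
fixed_reports = [
    ("james", {"repository", "resource", "editor"}),
    ("james", {"repository", "wizard"}),
    ("alice", {"editor", "layout"}),
]

n_d = Counter()   # n_d: number of reports fixed by developer d
n_t = Counter()   # n_t: number of reports containing term t
n_dt = Counter()  # n_{d,t}: reports fixed by d that contain t
for dev, terms in fixed_reports:
    n_d[dev] += 1
    for t in terms:
        n_t[t] += 1
        n_dt[(dev, t)] += 1

def mu_t(dev, t):
    # mu_t(d) = n_{d,t} / (n_t + n_d - n_{d,t})  (Jaccard coefficient)
    denom = n_t[t] + n_d[dev] - n_dt[(dev, t)]
    return n_dt[(dev, t)] / denom if denom else 0.0

def mu_B(dev, report_terms):
    # mu_B(d) = 1 - prod_{t in B} (1 - mu_t(d))  (fuzzy union)
    p = 1.0
    for t in report_terms:
        p *= 1.0 - mu_t(dev, t)
    return 1.0 - p

new_report = {"repository", "connection"}
ranking = sorted(n_d, key=lambda d: mu_B(d, new_report), reverse=True)
```

On this toy data, only "james" has fixed reports containing "repository", so mu_t("james", "repository") is 1.0 and he tops the ranking for the new report.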
The value µB(d) = 0 when all µt(d) = 0, i.e. d has never fixed any report containing any term in B. Thus, Bugzie considers that d might not be as suitable as others for fixing the technical issues reported in B. Otherwise, if there is a term with µt(d) = 1, then µB(d) = 1 and d is considered a capable developer (since only d has fixed bug reports with term t before). In general, the more terms in B with high µt(d) scores, the higher µB(d) is, i.e. the more likely d is a capable fixer for bug report B. After calculating µB(d) for all available developers, Bugzie ranks them by those membership values and recommends the top-n developers as the ones who should fix the bug(s) reported in B.

2.3 Updating

When new information is available (e.g. new bug reports are fixed by some developers), Bugzie updates its training data by updating all existing fuzzy sets Ct and creating new sets for new terms. The update can be done incrementally. As shown above, Ct is defined via the membership values µt(d), and µt(d) is calculated from nd, nt, and nd,t. Therefore, Bugzie stores only the values nd, nt, and nd,t, and updates them when newly fixed bug reports become available by adding the corresponding counts for the new data. Specifically, if a new term (or a new developer) appears in the new data, Bugzie simply creates new counters nt (or nd) and nd,t. If a developer has a new fixing activity, Bugzie simply updates the corresponding counters nd and nd,t. For example, the number of reports fixed by d is updated with the number of new fixed reports from d: nd := nd + n′d. Derivative values such as µt(d) and µB(d) are calculated from those counts on demand. This makes our incremental training algorithm very efficient in comparison with other modeling/learning techniques. Importantly, it fits well with the evolutionary nature of a software system and a software development process.
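Because only the three counters are stored, the update step is a simple fold over the newly fixed reports; a minimal sketch with hypothetical data (not the authors' implementation):

```python
from collections import Counter

# The model state is just three counters; mu values are derived on demand.
n_d, n_t, n_dt = Counter(), Counter(), Counter()

def update(newly_fixed):
    """Fold a batch of newly fixed reports (dev, terms) into the counters.

    Counter creates entries for unseen developers and terms automatically,
    so new fixers and new vocabulary are absorbed without retraining."""
    for dev, terms in newly_fixed:
        n_d[dev] += 1            # n_d := n_d + n'_d, one report at a time
        for t in terms:
            n_t[t] += 1
            n_dt[(dev, t)] += 1

update([("james", {"repository", "editor"})])  # initial training batch
update([("bob", {"repository"})])              # later batch: a new developer
```

No pass over the previously seen reports is needed, which is what makes the incremental training cheap compared with retraining a classifier from scratch.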
Table 1: Prediction Accuracy Result (%)

Approach                  Top-1  Top-2  Top-3  Top-4  Top-5
Naïve Bayes               23.68  33.72  39.76  43.88  47.05
Bayesian Network          12.20  18.03  22.17  25.50  27.88
C4.5 (Decision Trees)     18.68  23.97  24.86  25.10  25.14
SVM                       27.38  38.53  45.26  49.78  53.02
Inc. Naïve Bayes          25.86  36.39  42.49  46.61  49.78
Inc. Bayesian Network     14.06  20.91  25.52  29.01  31.86
Fuzzy Set (this paper)    37.81  52.11  59.70  64.52  68.00

3. EVALUATION

3.1 Experiment Setup

We conducted a preliminary evaluation on the Eclipse project, which has been used in evaluating the existing state-of-the-art approaches [1, 3, 7]. From Eclipse's bug tracking repository [6], we collected 69,829 bug reports that were filed and fixed from January 2008 to November 2010. For each bug report, we extracted its unique ID, the actual fixing developer's ID, the short summary, and the full description. There are in total 1,510 fixing developers for those bug reports. We merged the summary and description of each bug report, extracted their terms, and preprocessed them, e.g. stemming for term normalization and removing grammatical and stop words. Finally, we had a total of 103,690 terms.

We used the same longitudinal experiment setup as in [3], simulating the usage of our tool in reality. That is, all bug reports are sorted in chronological order and then divided into 11 non-overlapping, equally sized frames, each indexed by its creation time. Initially, frame 0 and its bug reports are used for training only. Then, Bugzie uses that training data to make recommendations for the first 100 bug reports in frame 1, giving a top list of T developers recommended to fix each of those 100 bug reports. If the recommendation list for a bug report B contains its actual fixer, we count this as a hit (i.e. a correct recommendation). After that, we update the counts for the fuzzy sets with the tested 100 bug reports and move to the next 100 bug reports in the same frame.
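This longitudinal setup can be sketched as a prequential loop: score each 100-report window with the model trained on everything before it, then fold the window in. The function below is an illustrative stand-in, with `recommend` and `update` as hypothetical callables for the steps of Sections 2.2 and 2.3:

```python
def evaluate(frames, top_T, recommend, update):
    """Prequential accuracy: each window of 100 reports is predicted with
    the model trained on all earlier reports, then absorbed via update().

    frames: list of frames, each a list of (actual_fixer, terms) pairs.
    recommend(terms, top_T): returns the top-T recommended developers.
    update(batch): folds a batch of fixed reports into the model."""
    hits = total = 0
    update(frames[0])                      # frame 0 is used for training only
    for frame in frames[1:]:
        for i in range(0, len(frame), 100):
            window = frame[i:i + 100]
            for fixer, terms in window:
                if fixer in recommend(terms, top_T):
                    hits += 1              # actual fixer was in the top-T list
                total += 1
            update(window)                 # incremental update with tested reports
    return hits / total if total else 0.0
```

The accuracy reported is then the ratio of hits to prediction cases, averaged over the tested frames.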
After completing frame 1, the updated training data is used to test frame 2 in the same manner, and we repeat for the remaining frames. For each frame under test, we use the prior frames for training and calculate the prediction accuracy as in [3], i.e. the ratio of the number of hits over the total number of prediction cases. We then calculate the average value over all 10 tested frames, for each size of the top-ranked list T from 1 to 5.

For comparison purposes, we also used Weka [12] to re-implement the existing state-of-the-art approaches [1, 7, 3] with the same experimental setup, following the descriptions of the approaches in their papers. We calculated the prediction accuracy and measured time efficiency, i.e. the total time of training, updating, and recommending.

3.2 Results

Table 1 shows the results of the different approaches. Anvik et al. [1] employed SVM, Naive Bayes, and C4.5 classifiers. Bhattacharya and Neamtiu [3] used Naive Bayes and a Bayesian network, with and without incremental learning. Cubranic and Murphy [5] used Naive Bayes. Figure 3 displays the accuracy comparison of Bugzie with the others when the recommended list contains 1 to 5 top-ranked developers.

Figure 3: Prediction Accuracy Comparison

As seen, Bugzie outperforms the other approaches in terms of both prediction accuracy and time efficiency. For top-1 recommendation (i.e. recommending only one fixer for each bug report under test), Bugzie has a prediction accuracy of 37.81% on average, i.e. in 37.81% of the cases it correctly recommends the developer who actually fixed the bug(s). For top-5 recommendation, its accuracy is 68%, i.e. in 68% of the cases the actual fixer is in its top-5 recommended list. The other machine-learning approaches reach a maximum accuracy of 53.02% at top-5.

Importantly, Bugzie is also more time efficient than those approaches. Table 2 shows the total time spent in training and recommendation for the different approaches.

Table 2: Time Efficiency Comparison (hh:mm:ss)

Approach                       Training   Recommendation
Naïve Bayes                    08:49:17   131:33:04
Bayesian Network               15:21:38   180:26:58
C4.5 (Decision Trees)          129:17:37  00:05:58
SVM                            06:01:57   11:46:09
Incremental Naïve Bayes        36:52:43   129:15:49
Incremental Bayesian Network   53:47:42   190:16:47
Fuzzy Set (this paper)         03:52:57   00:00:22

As shown, the training time of Bugzie (about 4 hours) is smaller than that of the other approaches; the second fastest approach takes about 6 hours. For the case of C4.5, our data set is too large for Weka to run on all 11 frames; the value in the C4.5 row of Table 2 covers only 4 frames. Bugzie's recommendation time is much smaller because it just needs to compute µB(d) (Section 2.2) and rank developers by their scores. That is, Bugzie is more efficient in computation, while the other approaches, which use machine-learning techniques, may not scale well for very large data sets.

4. RELATED WORK

Several approaches apply machine learning (ML) and/or information retrieval (IR) to (semi-)automate the process of bug triaging. The first approach along that line is from Cubranic and Murphy [5]. From the titles and descriptions of bug reports, keywords and developers' IDs are extracted and used to build a text classifier using the Naive Bayes technique. Their classifier recommends potential fixers based on the classification of a new report. Their prediction accuracy is up to 30% on an Eclipse bug report data set from January to September 2002. Anvik et al. [1] follow a similar ML approach and improve on Cubranic et al.'s work by filtering out invalid data such as unfixed bug reports and no-longer-working or inactive developers. With three different classifiers using SVM, Naive Bayes, and C4.5, they achieved a precision of up to 64%. In contrast to the ML techniques in those approaches, our fuzzy set approach has higher computational efficiency in incremental data training.
Moreover, it naturally provides a ranked list of potential fixers, while the outcome of a classifier in their approaches is the assignment of a bug report to one specific developer. Another related approach is from Bhattacharya and Neamtiu [3]. Similar to Bugzie, their model is capable of incremental learning. However, in contrast to the fuzzy set approach in Bugzie, they use an ML approach with Naive Bayes and a Bayesian network to build classifiers for keywords extracted from reports. Bugzie therefore has better time efficiency and a more natural ranking scheme than their ML classifiers. Moreover, as seen in Section 3, Bugzie outperformed their (incremental) Naive Bayes and Bayesian classifiers.

The idea of bug tossing graphs was first introduced by Jeong et al. [7]: their Markov-based model learns the patterns of bug tossing from developer to developer after a bug was assigned in the past, and uses such knowledge to improve bug triaging. Their goal is oriented more toward reducing the lengths of bug tossing paths [7] than toward answering who should fix a particular bug in an initial assignment. The fuzzy set approach could be combined with bug tossing graphs to further improve our accuracy. Lin et al. [9] use an ML approach with SVM and C4.5 classifiers on both textual data and non-text fields (e.g. bug type, priority, submitter, phase, and module IDs). Executing on a proprietary project with 2,576 bug records, their models achieve an accuracy of up to 77.64%; the accuracy is 63% if module IDs are not considered. Bugzie has higher accuracy and could integrate non-text fields for further improvement.

Other researchers use IR for automatic bug triaging. Canfora and Cerulo [4] use the terms of fixed change requests to index source files and developers, and then query them as a new change request comes in, in order to automate bug triaging. However, the accuracy was not very good (10-20% on Mozilla and 30-50% on KDE).
Their indexing scheme supports neither incremental learning nor probability. Matter et al. [10] introduce Develect, a model of developers' expertise built by extracting the terms in their contributed code. A developer's expertise is represented by a vector of the frequencies of the terms appearing in his source files; the vector for a new bug report is compared with the developers' vectors for bug triaging. Tested on 130,769 bug reports in Eclipse, its accuracy (up to 71% with a top-10 recommendation list) is not as high as Bugzie's. While Develect is based on the vector space model (VSM), a deterministic traditional IR method, Bugzie models developers' expertise with fuzzy sets, enabling more flexible computation and modeling of developers' bug-fixing expertise, as well as incremental learning for time efficiency. For example, in Develect, the length of a vector representing a developer's expertise must cover all terms occurring in the data set, whereas with its fuzzy sets Bugzie can choose thresholds to be more selective about the set of terms for one developer. Moreover, as the software evolves (new developers and terms), VSM must recompute the entire vector set. Baysal et al. [2] proposed to enhance VSM in modeling developers' expertise with preference elicitation and task allocation. Rahman et al. [11] measure the quality of an assignment by the match between the requested (from bug reports) and available (from developers) competence profiles.

In brief, existing ML-based classification approaches [1, 3, 9] characterize the classes of bugs that each developer is capable of fixing, and then classify a new bug report based on that classification for bug triaging. Other approaches profile developers' expertise via the terms in past fixed bug reports, and match a new report with such profiles [2, 10].

5. CONCLUSIONS

In this paper, we propose Bugzie, a new fuzzy set-based approach for automatic bug triaging.
Fuzzy sets are used to represent the sets of developers capable of fixing the bugs related to individual technical aspects, via technical terms. Such fuzzy sets are computed for each term in a new bug report and are then union'ed to find the capable developers for the report. Our preliminary evaluation shows that Bugzie achieves higher accuracy and efficiency than existing approaches.

Acknowledgment. This project is funded by NSF CCF-1018600 award.

6. REFERENCES

[1] J. Anvik, L. Hiew, and G. C. Murphy. Who should fix this bug? In ICSE '06, pages 361-370. ACM, 2006.
[2] O. Baysal, M. W. Godfrey, and R. Cohen. A bug you like: A framework for automated assignment of bugs. In ICPC '09, pages 297-298. IEEE CS, 2009.
[3] P. Bhattacharya and I. Neamtiu. Fine-grained incremental learning and multi-feature tossing graphs to improve bug triaging. In ICSM '10. IEEE CS, 2010.
[4] G. Canfora and L. Cerulo. Supporting change request assignment in open source development. In SAC '06: ACM Symposium on Applied Computing. ACM Press, 2006.
[5] D. Cubranic and G. Murphy. Automatic bug triage using text categorization. In SEKE '04. KSI Press, 2004.
[6] Eclipse Bugzilla repository. bugs.eclipse.org/bugs/.
[7] G. Jeong, S. Kim, and T. Zimmermann. Improving bug triage with bug tossing graphs. In FSE '09. ACM, 2009.
[8] G. J. Klir and B. Yuan. Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall, 1995.
[9] Z. Lin, F. Shu, Y. Yang, C. Hu, and Q. Wang. An empirical study on bug assignment automation using Chinese bug data. In ESEM '09. IEEE CS, 2009.
[10] D. Matter, A. Kuhn, and O. Nierstrasz. Assigning bug reports using a vocabulary-based expertise model of developers. In MSR '09, pages 131-140. IEEE CS, 2009.
[11] M. Rahman, G. Ruhe, and T. Zimmermann. Optimized assignment of developers for fixing bugs: an initial evaluation for Eclipse projects. In ESEM '09. IEEE CS, 2009.
[12] Weka. http://www.cs.waikato.ac.nz/ml/weka/.