Don't Touch My Code
ABSTRACT
Ownership is a key aspect of large-scale software development. We examine the relationship between different ownership measures and software failures in two large software projects: Windows Vista and Windows 7. We find that in all cases, measures of ownership such as the number of low-expertise developers and the proportion of ownership for the top owner have a relationship with both pre-release faults and post-release failures. We also empirically identify reasons that low-expertise developers make changes to components and show that the removal of low-expertise contributions dramatically decreases the performance of contribution based defect prediction. Finally, we provide recommendations for source code change policies and utilization of resources such as code inspections based on our results.

Categories and Subject Descriptors
D.2.8 [Software Engineering]: Metrics – Process metrics

General Terms
Measurement, Management, Human Factors

Keywords
Empirical Software Engineering, Ownership, Expertise, Quality

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ESEC/FSE’11, September 5–9, 2011, Szeged, Hungary.
Copyright 2011 ACM 978-1-4503-0443-6/11/09 ...$10.00.

1. INTRODUCTION
Many recent studies [6, 9, 26, 29] have shown that human factors play a significant role in the quality of software components. Ownership is a general term used to describe whether one person has responsibility for a software component, or if there is no one clearly responsible developer. Within Microsoft, we have found that when more people work on a binary, it has more failures [5, 26]. However, to our knowledge, the effect of ownership has not been studied in depth in commercial contexts. Based on our observations and discussions with project managers, we suspect that when there is no clear point of contact and the contributions to a software component are spread across many developers, there is an increased chance of communication breakdowns, misaligned goals, and inconsistent interfaces and semantics, all leading to lower quality.

Interestingly, unlike some aspects of software which are known to be related to defects, such as dependency complexity or size, ownership is something that can be deliberately changed by modifying processes and policies. Thus, the answer to the question “How much does ownership affect quality?” is important as it is actionable. Managers and team leads can make better decisions about how to govern a project by knowing the answer. If ownership has a big effect, then policies to enforce strong code ownership can be put into place; managers can also watch out for code which is contributed by developers who have inadequate relevant prior experience. If ownership has little effect, then the normal bottlenecks associated with having one person in charge of each component can be removed, and available talent reassigned at will.

We have observed that many industrial projects encourage high levels of code ownership. In this paper, we examine ownership and software quality. We make the following contributions in this paper:

1. We define and validate measures of ownership that are related to software quality.
2. We present an in-depth quantitative study of the effect of these measures of ownership on pre-release and post-release defects for multiple large software projects.
3. We identify reasons that components have many low-expertise developers contributing to them.
4. We propose recommendations for dealing with the effects of low ownership.

2. THEORY & RELATED WORK
A number of prior studies have examined the effect of developer contribution behavior on software quality.

Rahman & Devanbu [30] examined the effects of ownership & experience on quality in several open-source projects, using a fine-grained approach based on fix-inducing fragments of code, and report findings similar to those of our
paper. However, they operationalize ownership differently, and ownership policies and practices in OSS and commercial software are quite different. Thus the similarity of effect is striking. Furthermore, Rahman & Devanbu do not study the relationship of minor contribution on software dependencies; nor do they consider social network measures.

Weyuker et al. [35] examined the effect of including team size in prediction models. They use a count of the developers that worked on each component, but do not examine the proportion of work, which we account for. They found a negligible increase in failure prediction accuracy when adding team size to their models. We differ in that we examine the proportion of contributions made by each developer to a component. Further, we are not interested in prediction, but rather determining if there is a statistically significant relationship between ownership and failures.

Similarly, Meneely and Williams examined the relationship of the number of developers working on parts of the Linux kernel with security vulnerabilities [19]. They found that when more than nine developers contribute to a source file, it is sixteen times more likely to include a security vulnerability.

New methods such as Extreme Programming (XP) [4] profess collective code ownership but there has been little empirical evidence or backing of this data on reasonably mature/complex or large systems. Our study is the first to empirically quantify the effect code owners (and low-expertise contributors) have on the overall code quality.

Domain, application, and even component-specific knowledge are important aids for helping developers to maintain high quality software. Boh et al. found that project specific expertise has a much larger impact on the time required to perform development tasks than high levels of diverse experience in unrelated projects [7]. In a qualitative study of 17 commercial software projects, Curtis et al. [10] found that “the thin spread of application domain knowledge” was one of the top three salient problems. They also found that one common trait among engineers categorized as “exceptional” was that they had deep domain knowledge, and understood how the system design would generate the system behavior customers expected, even under exceptional circumstances. Such knowledge is not easily obtained. One systems engineer explained, “Someone had to spend a hundred million to put that knowledge in my head. It didn’t come free.”

The question naturally arises, how can we determine who has such domain knowledge? Fortunately, there is a wealth of literature that uses the prior development activity on a component as a proxy for expertise and knowledge with respect to the component. As examples, Expertise Browser from Mockus et al. [22] and Expertise Recommender from McDonald and Ackerman [18] both use measures of the amount of work that a developer has performed on a software component to recommend component experts. Fritz et al. found that the ability of a developer to answer questions about a piece of code in a system was strongly determined by whether the developer had authored some of the code, and how much time was spent authoring it [15].

Mockus and Weiss used properties of an individual change to predict the probability of that change causing a failure [23]. They found that changes made by developers that were more experienced with a piece of code were less likely to induce failure. Three of their fourteen measures capture the experience of a developer history (by counting prior changes) and they were significant in prediction.

In a study of offshoring and succession in software development [21], Mockus evaluated a number of succession measures with the goal of being able to automatically identify mentors for developers working on a per-component basis. A succession measure based on ownership was able to accurately pinpoint the most likely method and was used in a large scale study evaluating the factors affecting productivity in project succession and offshoring.

Research in other domains, such as manufacturing, has found that when a worker performs a task repeatedly, the labor requirements to complete subsequent work in the same task decrease and the quality increases [11]. Software development differs from these domains in that workers do not perform the exact same task repeatedly. Rather, software development represents a form of constant problem solving in which tasks are rarely exactly the same, but may be similar. Nonetheless, developers gain project and component specific knowledge as they repeatedly perform tasks on the same systems [32]. Banker et al. found that increased experience increases a developer’s knowledge of the architectural domain of the system [1]. Repeatedly using a particular API, or working on a particular system, creates episodic knowledge. Robillard indicates that the lack of such knowledge negatively affects the quality of software [31]. Indeed, Basili and Caldiera present an approach for improving quality in software development through learning and experience by establishing “experience factories” [2]. They claim that by reusing knowledge, products, and experience, companies can maintain high quality levels because developers do not need to constantly acquire new knowledge and expertise as they work on different projects. Drawing on these ideas, we develop ownership measures which consider the number of times that a developer works on a particular component, with the idea that each exposure is a learning experience and increases the developer’s knowledge and abilities.

There is a knowledge-sharing factor at play as well. The set of developers that contribute to a component implicitly form a team that has shared knowledge regarding the semantics and design of the component. Coordination is a known problem in software development [16]. In fact, another of the top three problems identified in Curtis’ study [10] was “communication and coordination breakdowns.” Working in such a group always creates a need for sharing and integrating knowledge across all members [8]. Cataldo et al. showed that communication breakdowns delay tasks [9]. If a member of this team devotes little attention to the team and/or the component, they may not acquire the knowledge required to make changes to the component without error. We attempt to operationalize these team members in this paper and examine their effect on quality.

If ownership of a particular component in a system (whether it be a file, class, module, plugin, or subsystem) is a valid proxy for expertise, then what is the effect of having most changes made by those with little expertise? Is it better to have one clear owner of a software component? We operationalize ownership in two key ways here and formally define our measures in section 3. One measure of ownership is how much of the development activity for a component comes from one developer. If one developer makes 80% of the changes to a component, then we say that the component has high ownership. The other way that we measure
ownership is by determining how many low-expertise developers are working on a component. If many developers are all making few changes to a component, then there are many non-experts working on the component and we label the component as having low ownership.

We expect that having one clear “owner” for a component will lead to fewer failures and that when many non-experts are making changes, indicating that ownership is spread across many contributors, the component will have more failures.

Figure 1: Graph of the proportion of commits to abocamp.dll by developers during the Vista development cycle, showing the four measures of ownership used in this paper.

3. TERMINOLOGY AND METRICS
We adopt Basili’s goal question metric approach [3] to frame our study of ownership. Our goal is to understand the relationship between ownership and software quality. We also hope to gain an understanding of how this relationship varies with the development process in use. Achievement of this goal can lead to more informed development decisions or possibly process policy changes resulting in software with fewer defects.

In order to reach this goal, we ask a number of specific questions:

1. Are higher levels of ownership associated with fewer defects?
2. Is there a negative effect when a software entity is developed by many people with low ownership?
3. Are these effects related to the development process used?

In order to answer these questions, we propose a number of ownership metrics and use them to evaluate our hypotheses of ownership. We begin by defining some important terms and metrics used throughout the rest of this paper:

• Software Component – This is a unit of development that has some core functionality. Defects can be traced back to a specific component and software changes from developers can also be traced to a component. In Windows, a component is a compiled binary.
• Contributor – A contributor to a software component is someone who has made commits/software changes to the component.
• Proportion of Ownership – The proportion of ownership (or simply ownership) of a contributor for a particular component is the ratio of the number of commits that the contributor has made relative to the total number of commits for that component. Thus, if Cindy has made 20 commits to ie9.dll and there are a total of 100 commits to ie9.dll, then Cindy has an ownership of 20%.
• Minor Contributor – A developer who has made changes to a component, but whose ownership is below 5%, is considered a minor contributor to that component. This threshold was chosen based on examination of distributions of ownership (a sensitivity analysis with threshold values ranging from 2% to 10% yielded similar results). We refer to a commit from a minor contributor as a minor contribution.
• Major Contributor – A developer who has made changes to a component and whose ownership is at or above 5% is a major contributor to the component, and a commit from such a developer is a major contribution.

Note that we examine the number of changes to a component made by a developer rather than the actual number of lines modified. Within Windows, each change corresponds to one fix or enhancement and individual changes are quite small, usually on the order of tens of lines. We use number of changes because each change represents an “exposure” of the developer to the code and because the previous measure of experience used by Mockus and Weiss also used the number of changes. However, prior literature [14] has shown high correlation (above 0.9) between number of changes and number of lines contributed and we have found similar results in Windows, indicating that our results would not change significantly. With these terms defined, we now introduce our metrics.

• Minor – number of minor contributors
• Major – number of major contributors
• Total – total number of contributors

[Figure panels: Ownership of A.dll by Developers; Ownership of B.dll by Developers]

Hypothesis 1 - Software components with many minor contributors will have more failures than software components that have fewer.

We also look at the proportion of ownership for the highest contributing developer for each component (Ownership). If Ownership is high, that indicates that there is one developer who “owns” the component and has a high level of expertise. This person can also act as a single point of contact for others who need to use the component, need changes to it, or just have questions about it. We theorize that when such a person exists, the software quality is higher, resulting in fewer failures.
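The definitions above can be illustrated with a minimal sketch. It assumes the only input is a list holding one author name per commit to a component; the function name `ownership_metrics`, the developer names other than Cindy, and the sample commit log are hypothetical and not taken from the study's tooling.

```python
from collections import Counter

MINOR_THRESHOLD = 0.05  # ownership below 5% marks a minor contributor


def ownership_metrics(commit_authors, threshold=MINOR_THRESHOLD):
    """Compute Minor, Major, Total, and Ownership for one component.

    commit_authors: iterable with one author name per commit (assumed input).
    """
    counts = Counter(commit_authors)
    total_commits = sum(counts.values())
    if total_commits == 0:
        raise ValueError("component has no commits")
    # Proportion of ownership per contributor: own commits / all commits.
    shares = {dev: n / total_commits for dev, n in counts.items()}
    minor = sum(1 for share in shares.values() if share < threshold)
    return {
        "Minor": minor,                    # contributors below the threshold
        "Major": len(shares) - minor,      # contributors at or above it
        "Total": len(shares),              # all contributors
        "Ownership": max(shares.values()), # share of the top contributor
        "shares": shares,
    }


# Hypothetical commit log for ie9.dll: Cindy made 20 of the 100 commits.
log = ["Cindy"] * 20 + ["Dan"] * 76 + ["Eve"] * 4
metrics = ownership_metrics(log)
print(metrics["shares"]["Cindy"], metrics["Minor"], metrics["Major"],
      metrics["Total"], metrics["Ownership"])
```

In this toy log, Eve falls below the 5% threshold and counts as the single minor contributor, while Dan's share determines Ownership.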
Table 1: Bivariate Spearman correlation of ownership and code metrics with pre- and post-release failures in Windows Vista
and Windows 7. All correlations are statistically significant except for that of Ownership and post-release failures in Windows
7.
Table 2: Variance in failures for the base model which includes standard metrics of complexity, size, and churn, as well as the
models with Minor and Ownership added. An asterisk∗ denotes that a model showed statistically significant improvement
when the additional variable was added.
of one or more variables on an outcome when controlling for other variables. We use it for this purpose in an effort to examine the relationship of our ownership measures when controlling for source code characteristics such as size, complexity, and churn.

A linear regression model for failures indicates which variables have an effect on failures, how large the effect is, in what direction (i.e. if failures go up when a metric goes up or when it goes down), and how much of the variance in the number of failures is explained by the metrics. We compare the amount of variance in failures explained by a model that includes the ownership metrics to a model that does not include them. There are many measures of churn, complexity, and size. However, to avoid multi-collinearity and over-fitting, we include only one of each measure in the model; we choose the measure which results in the best base model. This gives an indication of how much ownership actually affects software failures. We examined the improvement in amount of variance in failures explained by the metrics (commonly referred to as the adjusted R²) and examined improved goodness of fit using F-tests to determine if the addition of an ownership metric improves the model by a statistically significant degree [12].

Linear regression models can be reliably interpreted if certain assumptions hold. Two key assumptions are that the residuals are normally distributed, and not correlated with any of the independent variables. In our analysis, we found that the distribution of failures was almost always heavily right skewed, which led to a similar skew in the residuals. When we transformed the dependent variable to be the log of the number of failures, the skew diminished, and the residuals fit the normality assumption. This data transformation was applied to all dependent variables except for post-release failures in Vista, where linear regression assumptions were met by the raw data.

6. RESULTS
We now present the results of our analysis of Windows Vista and Windows 7. Table 2 illustrates the results of our analysis. We denote with an asterisk (∗) cases where a goodness-of-fit F-test indicated that the addition of a variable improved the model by a statistically significant degree. The value in parentheses indicates the percent increase in variance explained over the model without the added variable. For example, in Table 2 the Base+Minor+Major model in Vista explains 48% of the variance in pre-release failures, which is 2% more than the Base+Minor model, which explains 46%. The Base+Minor model explains 20% more of the variance in pre-release failures than the Base model. Adding an independent variable to a model can never decrease the variance explained, so we use the adjusted R² measure, which penalizes models that have additional variables.

We built five statistical models of failures for pre- and post-release defects in Windows Vista and Windows 7 (summarized in Table 2). The first model contains only the classical source code metrics: size, complexity, and churn. We refer to this as the base model. This model showed that churn, size, and complexity all have a statistically significant effect on both pre- and post-release failures. In addition, these metrics are able to explain 26% of the variance in pre-release failures and 29% of the variance in post-release failures in Vista, and 24% and 28% in Windows 7.

In the second model, we added Total to the classic variables. This examines the effect of team size on defects and does not include any measures of the proportion of contributions made by individual members. All models exhibited a statistically significant improvement in variance explained.

Next, we added Minor to the set of predictor variables in the base model. This was done to determine if the total number of developers has a different effect on quality than the number of minor contributors. The statistics showed that Minor is positively related to both pre- and post-release failures to a statistically significant degree. The addition of Minor increased the proportion of variance explained in pre-release failures to 46% and in post-release failures to 41%. The gains shown by Minor were stronger than those shown by Total for both types of failures to a statistically significant degree, in all cases except for post-release failures in Windows 7, indicating that Minor has a larger effect on failures.

The addition of Major and Ownership showed smaller gains, but these were often still statistically significant. We found similar results regardless of the order in which these variables were added to the models. Ownership was found to have a negative relationship with failures to a statistically significant degree and Major had a positive relationship, but one much smaller than that of Minor. Minor still showed more of an effect than Ownership and Major even when it was added last (not shown). The final models account for up to 72% of variance in failures. In all cases, ownership had a stronger relationship with pre-release failures than post-release failures and the models in general were less explanatory. This may indicate that there are already measures being taken (e.g. increased testing, more stringent quality controls, etc.) between implementation completion and release to counteract the effects of poor ownership.
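The model comparison described above rests on two standard quantities: the adjusted R², which penalizes additional predictors, and the F statistic for comparing nested linear models. The sketch below uses the textbook formulas; the function names and every number in the example (sample size, residual sums of squares) are hypothetical and are not values from Table 2.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fit on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def nested_f_statistic(rss_base, rss_full, n, p_full, q):
    """F statistic for adding q predictors to a nested linear model.

    rss_base, rss_full: residual sums of squares of the smaller and the
    larger model; p_full: predictors in the larger model (no intercept).
    """
    return ((rss_base - rss_full) / q) / (rss_full / (n - p_full - 1))


# Hypothetical setting: 1000 components, a base model with three predictors
# (size, complexity, churn) and a larger model that also includes Minor.
n = 1000
base_adj = adjusted_r2(0.26, n, 3)
minor_adj = adjusted_r2(0.46, n, 4)
f_stat = nested_f_statistic(rss_base=74.0, rss_full=54.0, n=n, p_full=4, q=1)
print(base_adj, minor_adj, f_stat)
```

The F statistic would then be compared against an F distribution with q and n - p_full - 1 degrees of freedom to judge significance; that lookup is omitted here to keep the sketch dependency-free.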
For all metrics that measure ownership levels there is a clear trend of having a statistically significant relationship to failures in Windows. In all cases, Major and Ownership show less of an effect than Minor or Total, indicating that the number of higher-expertise contributors has marginal effect on quality, although the results are still statistically significant.

The results of our analysis of ownership in both releases of Windows can be interpreted as follows:

2. Higher levels of ownership for the top contributor to a component result in fewer failures when controlling for the same metrics, but the effect is smaller than the number of minor contributors.
3. Ownership has a stronger relationship with pre-release failures than post-release failures.
4. Measures of ownership and standard code measures show a much smaller relationship to post-release failures in Windows 7.

7. EFFECTS OF MINOR CONTRIBUTORS
One of the key findings in our analysis was that the number of minor contributors has a strong relationship with failures in both releases of Windows. Since Microsoft has the capability to make changes to practices based on these findings, we were eager to gain a deeper understanding of this phenomenon. To this end, we performed two more detailed analyses in order to examine the minor contributors further.

First, we observed that almost all developers were major contributors to some binaries and minor contributors to others; very few developers never played a major contributor role. This led us to investigate the obvious question: Given a particular developer, is there a relationship between a component to which she is a major contributor, and one to which she is a minor contributor?

Second, we adapted a fault prediction study carried out by Pinzger et al. [29] and examined the effect of modifying the study in ways related to ownership.

7.1 Dependency Analysis
The majority of developers that contributed to Windows acted as major contributors to some binaries and minor contributors to others. There were very few developers who were only minor contributors. This fact is an indication of strong code ownership, as it shows that nearly everyone has a main responsibility for at least one binary.

Discussions with engineers at Microsoft indicated that often an engineer who was the owner of one binary would make changes to another binary whose services he or she used, often in the process of addressing reported bugs. In our context this would show up as one engineer who was a major contributor to some binary, A, and a minor contributor to some binary, B, with a dependency relationship between A and B. We call this a Major-Minor-Dependency relationship, which is illustrated in Figure 3.

Figure 3: Illustration of the major-minor-dependency relationship commonly observed in Vista

Cataldo et al. found that making changes to a depending component without coordinating with the other stakeholders (in our case, the owner) of the component increases the likelihood of faults [9]. We have no record of the communication between developers of Windows. However, the fact that a minor contributor has, by definition, made few if any prior contributions to a component suggests that their participation in the component’s implicit team is likely minimal, increasing the risk of introducing a bug.

But does this actually happen? Is a developer D, working on binary Foo.exe, statistically more likely to be a minor contributor to a binary Bar.dll, just because Foo.exe depends on Bar.dll? If so, how many of the minor contributors to components can this phenomenon account for? If the majority of minor contributors are a result of component owners making changes to depending or dependent components to accomplish their own tasks such as resolving failures, then deliberate steps could be taken to avoid this type of risky behavior.

To investigate this further, we employed a static analysis tool, MaX [33], to detect dependency relationships between binaries. MaX uses debugging information files that are generated during compilation to identify these relationships, which include method calls, reads and writes to the registry, IPC, COM calls, and use of types. We were unable to obtain the required debugging information files for Windows 7 and thus limit our analysis here to Vista. Using this tool, we constructed a dependency graph that includes all of the binaries in Windows Vista.

The next step is to determine whether the major-minor-dependency phenomenon occurs statistically more often than would be expected by chance. But what exactly does “by chance” mean? We model “chance” by generating a large, plausible, random sample of contributions; we can then compare the observed frequency of major-minor-dependency with the frequency in the generated sample. Our plausible random model is that each developer chooses their contributions at random, while preserving their rate of minor and major contributions. In other words, a developer is just as hardworking, but her choice of where to contribute is not influenced by dependencies in the code. Using this model, we generate a large sample of simulated contribution graphs.
the normally distributed frequency of this phenomenon out of all 10,000 graphs was 32% of the time, indicating that 52% is definitely a statistically significant difference, and the phenomenon that we are observing does not occur by chance.

In Vista, one common reason that a developer is a minor contributor to a binary is that he/she is a major contributor to a depending binary. This allows for processes to be put into place to recognize and either minimize or aid minor contributions.
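The random model used in this comparison can be sketched as a small Monte Carlo simulation. This is one plausible reading of the procedure, not the authors' implementation: it assumes a dependency is a pair (a, b) meaning binary a depends on binary b, that the measured quantity is the fraction of minor-contribution edges whose developer is a major contributor to a depending binary, and that randomization redraws each developer's binaries uniformly while keeping their counts of major and minor contributions. All function names and the toy data are hypothetical.

```python
import random


def mmd_fraction(minor, major, depends):
    """Fraction of minor contributions that fit Major-Minor-Dependency.

    minor, major: sets of (developer, binary) contribution edges.
    depends: set of (a, b) pairs meaning binary a depends on binary b.
    """
    owned = {}
    for dev, binary in major:
        owned.setdefault(dev, set()).add(binary)
    hits = sum(
        1 for dev, b in minor
        if any((a, b) in depends for a in owned.get(dev, ()))
    )
    return hits / len(minor) if minor else 0.0


def random_contributions(minor, major, binaries, rng):
    """Redraw each developer's binaries at random, keeping edge counts."""
    n_minor, n_major = {}, {}
    for dev, _ in minor:
        n_minor[dev] = n_minor.get(dev, 0) + 1
    for dev, _ in major:
        n_major[dev] = n_major.get(dev, 0) + 1
    new_minor, new_major = set(), set()
    for dev in sorted(set(n_minor) | set(n_major)):
        k_major, k_minor = n_major.get(dev, 0), n_minor.get(dev, 0)
        # Distinct binaries, so no developer is both major and minor on one.
        picks = rng.sample(binaries, k_major + k_minor)
        new_major.update((dev, b) for b in picks[:k_major])
        new_minor.update((dev, b) for b in picks[k_major:])
    return new_minor, new_major


def simulate(minor, major, depends, binaries, runs=1000, seed=0):
    """Return the Major-Minor-Dependency fraction for each random graph."""
    rng = random.Random(seed)
    return [
        mmd_fraction(*random_contributions(minor, major, binaries, rng), depends)
        for _ in range(runs)
    ]


# Toy data (hypothetical): Foo.exe depends on Bar.dll.
binaries = ["Foo.exe", "Bar.dll", "Baz.dll", "Qux.dll"]
depends = {("Foo.exe", "Bar.dll")}
major = {("Amy", "Foo.exe"), ("Bob", "Baz.dll")}
minor = {("Amy", "Bar.dll"), ("Bob", "Qux.dll")}

observed = mmd_fraction(minor, major, depends)
simulated = simulate(minor, major, depends, binaries, runs=200)
print(observed, sum(simulated) / len(simulated))
```

Comparing `observed` with the distribution in `simulated` mirrors the comparison of the observed frequency with the frequency across the generated graphs; a real analysis would use the full Vista dependency graph and many more runs.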
Table 3: Correlation of Social Network Analysis metrics on the contribution network with pre- and post-release failures.
Columns labeled “Minor” are correlations of failures with metrics computed on networks composed only of minor contribution
edges. Columns labeled “Major” are from networks made up of major contribution edges. For the majority of metrics,
removing the minor edges drops the correlations considerably. For some metrics, the direction of correlation actually changes
for “Major”.
Table 4: Performance of network based failure predictors for pre- and post-release failures for Vista and Windows 7
failures for Windows 7. Part of this may be attributable to a moderate relationship between Minor and Ownership, but although Ownership was significant in all models when removing Minor, the effect was smaller. Nonetheless, in all cases, higher values for Ownership were associated with lower numbers of failures. We therefore conclude that hypothesis 2 is supported in the case of Windows Vista and in pre-release data for Windows 7.

The results of empirical software engineering studies do not always generalize to settings where a different process is used. The process that is used may dictate the effect of other factors on software quality as well. Therefore, when determining the applicability of a research result to a software project, the context of the study must be taken into account. Microsoft employs strong ownership practices and our results are much more likely to hold in other industrial settings where similar policies are in place. Examining the effect of ownership in contexts where ownership is not stressed as highly, such as in many open source software (OSS) projects, is an area of continued study as we attempt to understand the interaction between ownership, quality, and varying software processes.

For contexts in which strong ownership is practiced or where empirical studies are consistent with our own findings, we make the following recommendations regarding the development process based on our findings:

1. Changes made by minor contributors should be reviewed with more scrutiny. Changes made by minor contributors should be exposed to greater scrutiny than changes made by developers who are experienced with the source for a particular binary. When possible, major contributors should perform these code inspections. If a major contributor cannot perform all inspections, he or she should focus on inspecting changes by minor contributors.

2. Potential minor contributors should communicate desired changes to developers experienced with the respective binary. Often minor contributors to one binary are major contributors to a depending binary. Rather than making a desired change directly, these developers should contact a major contributor and communicate the desired change so that it can be made by someone who has higher levels of expertise.

It may not always be possible to follow these recommendations (for instance, in cases where too many potential contributors need changes to a component for one developer to handle); however, they should be followed as much as possible within reason. These recommendations are currently being evaluated at Microsoft. We plan to investigate the relationship of the ownership measures used in this paper with software quality in other projects at Microsoft that differ in size and process domain (e.g. projects utilizing agile). Further, we plan to observe the results of projects that follow these recommendations.

9. CONCLUSION
We have examined the relationship between ownership and software quality in two large software development projects. We found that high levels of ownership, specifically operationalized as high values of Ownership and Major, and low values of Minor, are associated with fewer defects.

An investigation into the effects of minor and major contributions on network based defect prediction found that removing minor contribution edges severely impaired predictive power. We also found that when a component has a minor contributor, the same developer is a major contributor to a dependent component approximately half of the time, uncovering at least one significant reason for high levels of minor contributions. Changes to policies regarding tasks that would lead to this behavior, such as defect resolution and feature implementation, should be implemented and evaluated.

For organizations where ownership has a strong relationship with defects, we have presented recommendations which are currently being evaluated at Microsoft. As our measures of ownership are cheap and lightweight, we encourage other researchers and practitioners to perform and report their findings of similar analyses so that we can build a body of knowledge regarding ownership and quality in various domains and contexts.

10. REFERENCES
[1] R. Banker, G. Davis, and S. Slaughter. Software development practices, software complexity, and software maintenance performance: A field study. Management Science, 44(4):433–450, 1998.
3. Components with low ownership should be given pri- [2] V. Basili and G. Caldiera. Improve Soft-ware Quality
ority by QA resources. Metrics such as Minor and by Reusing Knowledge and Experience. Sloan
Ownership should be used in conjunction with source Management Review, 37:55–55, 1995.
code based metrics to identify those binaries with a [3] V. Basili, G. Caldiera, and H. Rombach. The Goal
high potential for having many post-release failures. Question Metric Approach. Encyclopedia of Software
When faced with limited resources for quality-control Engineering, 1:528–532, 1994.
efforts, these binaries should have priority. [4] K. Beck and C. Andres. Extreme Programming
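
As a rough illustration of how the Ownership, Minor, and Major measures named in
the third recommendation could be used to order binaries for QA attention, the
Python sketch below computes them from a list of commit authors per binary. The
5% cutoff between minor and major contributors is treated here as a configurable
assumption, and the function names, the sample binaries, and the tie-breaking
rule are all hypothetical; this is not the tooling used in the study.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class OwnershipMetrics(NamedTuple):
    ownership: float  # share of commits made by the top contributor
    minor: int        # contributors whose share is below the threshold
    major: int        # contributors whose share is at or above the threshold
    total: int        # all distinct contributors


def ownership_metrics(commit_authors: List[str],
                      threshold: float = 0.05) -> OwnershipMetrics:
    """Compute ownership measures for one binary.

    `commit_authors` holds one author name per commit. `threshold` is an
    assumed cutoff separating minor from major contributors.
    """
    counts = Counter(commit_authors)
    n_commits = sum(counts.values())
    if n_commits == 0:
        return OwnershipMetrics(0.0, 0, 0, 0)
    shares = [c / n_commits for c in counts.values()]
    minor = sum(1 for s in shares if s < threshold)
    return OwnershipMetrics(max(shares), minor, len(shares) - minor, len(shares))


def rank_for_qa(components: Dict[str, List[str]],
                threshold: float = 0.05) -> List[str]:
    """Order binaries so that those with the lowest Ownership come first.

    Ties are broken by the larger number of minor contributors (an
    illustrative choice, not one taken from the paper).
    """
    metrics = {name: ownership_metrics(authors, threshold)
               for name, authors in components.items()}
    return sorted(metrics,
                  key=lambda n: (metrics[n].ownership, -metrics[n].minor))


# Hypothetical commit histories for three binaries.
sample = {
    "net.dll": ["alice"] * 30 + ["bob"] * 9 + ["carol"],
    "ui.dll": ["dave"] * 10 + ["erin"] * 10,
    "core.dll": ["frank"] * 20,
}
qa_order = rank_for_qa(sample)
```

In practice such a ranking would be combined with source-code-based metrics, as
the recommendation suggests, rather than used on its own.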
Explained: Embrace Change. Addison-Wesley Reading, [20] R. Milo, N. Kashtan, S. Itzkovitz, M. E. J. Newman,
MA, 2005. and U. Alon. On the uniform generation of random
[5] C. Bird, N. Nagappan, P. Devanbu, H. Gall, and graphs with prescribed degree sequences. Arxiv
B. Murphy. Does distributed development affect preprint cond-mat/0312028, 2003.
software quality? an empirical case study of windows [21] A. Mockus. Succession: Measuring transfer of code
vista. In Proc. of the International Conference on and developer productivity. In Proceedings of the 31st
Software Engineering, 2009. International Conference on Software Engineering,
[6] C. Bird, N. Nagappan, P. Devanbu, H. Gall, and 2009.
B. Murphy. Putting it All Together: Using [22] A. Mockus and J. D. Herbsleb. Expertise browser: a
Socio-Technical Networks to Predict Failures. In quantitative approach to identifying expertise. In
Proceedings of the 17th International Symposium on Proc. of the 24th International Conference on
Software Reliability Engineering. IEEE Computer Software Engineering, 2002.
Society, 2009. [23] A. Mockus and D. Weiss. Predicting risk of software
[7] W. Boh, S. Slaughter, and J. Espinosa. Learning from changes. Bell Labs Technical Journal, 5(2):169–180,
experience in software development: A multilevel 2000.
analysis. Management Science, 53(8):1315–1331, 2007. [24] M. Molloy and B. Reed. A critical point for random
[8] F. Brooks. The Mythical Man-Month: Essays on graphs with a given degree sequence. Random Struct.
Software Engineering, 20th Anniversary Edition. Algorithms, 6(2-3):161–179, 1995.
Addison-Wesley, 1995. [25] N. Nagappan and T. Ball. Use of relative code churn
[9] M. Cataldo, P. Wagstrom, J. Herbsleb, and K. Carley. measures to predict system defect density. Proceedings
Identification of coordination requirements: of the 27th International Conference on Software
implications for the Design of collaboration and Engineering, pages 284–292, May 2005.
awareness tools. Proceedings of the 2006 20th [26] N. Nagappan, B. Murphy, and V. Basili. The influence
anniversary conference on Computer supported of organizational structure on software quality: an
cooperative work, pages 353–362, 2006. empirical case study. In Proc. of the 30th international
[10] B. Curtis, H. Krasner, and N. Iscoe. A field study of conference on Software engineering, 2008.
the software design process for large systems. [27] M. E. J. Newman, S. H. Strogatz, and D. J. Watts.
Communication of the ACM, 31(11):1268–1287, 1988. Random graphs with arbitrary degree distributions
[11] E. Darr, L. Argote, and D. Epple. The acquisition, and their applications. Phys. Rev. E, 64(2):026118, Jul
transfer, and depreciation of knowledge in service 2001.
organizations: Productivity in franchises. Management [28] T. Ostrand, E. Weyuker, and R. Bell. Where the bugs
Science, 41(11):1750–1762, 1995. are. In Proceedings of the ACM SIGSOFT
[12] S. Dowdy, S. Wearden, and D. Chilko. Statistics for international symposium on Software testing and
research. John Wiley & Sons, third edition, 2004. analysis, 2004.
[13] K. El Emam, S. Benlarbi, N. Goel, and S. N. Rai. The [29] M. Pinzger, N. Nagappan, and B. Murphy. Can
confounding effect of class size on the validity of developer-module networks predict failures? In
object-oriented metrics. IEEE Transactions of Proceedings of the 16th ACM SIGSOFT International
Software Engineering, 27(7):630–650, 2001. Symposium on Foundations of software engineering,
[14] S. Elbaum and J. Munson. Code churn: A measure for 2008.
estimating the impact of code change. In Proceedings [30] F. Rahman and P. Devanbu. Ownership, Experience
of the International Conference on Software and Defects: a fine-grained study of Authorship. In
Maintenance, 1998. Proceedings ICSE 2011, To appear, 2011.
[15] T. Fritz, G. Murphy, and E. Hill. Does a programmer’s [31] P. Robillard. The role of knowledge in software
activity indicate knowledge of code? In Proc. of the development. Communications of the ACM, 42(1):92,
ACM SIGSOFT symposium on The foundations of 1999.
software engineering, page 350. ACM, 2007. [32] M. Sacks. On-the-Job Learning in the Software
[16] R. Kraut and L. Streeter. Coordination in software Industry. Corporate Culture and the Acquisition of
development. Communications of the ACM, Knowledge. Quorum Books, 88 Post Road West,
38(3):69–81, 1995. Westport, CT 06881., 1994.
[17] F. W. Lancaster. Information Retrieval Systems: [33] A. Srivastava, J. Thiagarajan, and C. Schertz.
Characteristics, Testing, and Evaluation. Wiley, 2nd Efficient Integration Testing using Dependency
edition, 1979. Analysis. Technical Report MSR-TR-2005-94,
[18] D. W. McDonald and M. S. Ackerman. Expertise Microsoft Research, 2005.
recommender: a flexible recommendation system and [34] S. Wasserman and K. Faust. Social network analysis:
architecture. In Proc. of the ACM conference on Methods and applications. Cambridge University
Computer supported cooperative work, 2000. Press, 1994.
[19] A. Meneely and L. A. Williams. Secure open source [35] E. J. Weyuker, T. J. Ostrand, and R. M. Bell. Do too
collaboration: an empirical study of linus’ law. In many cooks spoil the broth? using the number of
Proceedings of the ACM Conference on Computer and developers to enhance defect prediction models.
Communications Security, 2009. Empirical Softw. Engg., 13(5):539–559, 2008.