Abstract
Open Source Software (OSS) project success relies on crowd contributions. When an issue arises in pull-request based systems, @-mentions are used to call on people to take up a task; previous studies have shown that @-mentions in discussions are associated with faster issue resolution. In most projects there may be many developers who could technically handle a variety of tasks. But OSS supports dynamic teams distributed across a wide variety of social and geographic backgrounds, as well as levels of involvement. It is, then, important to know whom to call on, i.e., who can be relied on or trusted with important task-related duties, and why. In this paper, we sought to understand which observable socio-technical attributes of developers can be used to build good models of their being @-mentioned in future GitHub issue and pull request discussions. We built overall and project-specific predictive models of future @-mentions, in order to capture the determinants of @-mentions in each of two hundred GitHub projects, and to understand if and how those determinants differ between projects. We found that visibility, expertise, and productivity are associated with an increase in @-mentions, while responsiveness is not, in the presence of a number of control variables. We also found that, though project-specific differences exist, the overall model can be used for cross-project prediction, indicating its GitHub-wide utility.





Notes
E.g., developers of upstream libraries rarely respond in the downstream project.
Developers were asked about communication methods, not explicitly the @-mention.
As described in Section 4.2, a reply @-mention is directed towards someone already in the discussion; a call @-mention is directed towards someone not yet in the discussion. In our data, there is indeed a very high correlation between reply @-mentions and discussion length (0.812); however, there is a relatively low correlation between call @-mentions and discussion length (0.283). As our focus is on call @-mentions, the correlation between reply @-mentions and discussion length is not a threat.
PyGithub did not properly handle some null responses from GitHub’s API.
Note that pull requests are a subset of issues.
We do, however, also use outdegree in our model.
E.g., standard algorithms require a full adjacency matrix to be in memory at once; memory will be exhausted for networks of our size.
This measure was originally called d by Blüthgen et al., but we use δ here to reserve d to represent developers.
We do not use \(\mathcal {MAF}\); instead, we use an analogous form for our social networks.
We use issues fixed before closing as a proxy for bugs; a higher value need not imply lack of aptitude, but it indicates a change in expected coding behavior and expertise.
\(\mathcal {ISS_{\kappa }}\) is not used for the zero component; it is undefined when call mentions are 0.
We could not perform this in-depth study for discussions not in English.
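The call/reply distinction drawn in the note on Section 4.2 above can be illustrated with a minimal sketch. It is not the authors' extraction pipeline: the function name `classify_mentions`, the handle pattern `MENTION_RE`, and the choice to treat a developer as "in the discussion" only once they have posted a comment are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical, simplified handle pattern (an assumption, not the paper's rule).
# The lookbehind skips e-mail addresses such as name@example.com.
MENTION_RE = re.compile(r"(?<![\w@])@([A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?)")


def classify_mentions(
    comments: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Label each @-mention in one discussion as 'call' or 'reply'.

    comments: (author, body) pairs in chronological order.
    Returns (mentioner, mentioned, kind) triples.
    Assumption: someone is 'in the discussion' once they have commented.
    """
    participants = set()
    labelled = []
    for author, body in comments:
        participants.add(author.lower())
        for match in MENTION_RE.finditer(body):
            target = match.group(1).lower()
            # Reply: target has already posted; call: target has not yet posted.
            kind = "reply" if target in participants else "call"
            labelled.append((author.lower(), target, kind))
    return labelled


if __name__ == "__main__":
    thread = [
        ("alice", "Build fails on main. @bob can you look?"),
        ("bob", "@alice which OS?"),
    ]
    for row in classify_mentions(thread):
        print(row)
```

Under this reading, only the first mention in the sample thread counts as a call, since bob has not yet commented when alice mentions him.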
References
Ackerman AF, Fowler PJ, Ebenau RG (1984) Software inspections and the industrial production of software. In: Proceedings of a symposium on Software validation: inspection-testing-verification-alternatives. Elsevier North-Holland, Inc, pp 13–40
Allison P (2012) When can you safely ignore multicollinearity? https://statisticalhorizons.com/multicollinearity
Bandura A (1973) Aggression: A social learning analysis. Prentice-Hall
Bandura A, Walters RH (1977) Social learning theory
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), pp 289–300
Bird C, Gourley A, Devanbu P, Swaminathan A, Hsu G (2007) Open borders? immigration in open source projects. In: The Fourth international workshop on mining software repositories
Bird C, Rigby PC, Barr ET, Hamilton DJ, German DM, Devanbu P (2009) The promises and perils of mining git. In: 6th IEEE international working conference on mining software repositories, 2009. MSR’09, pp 1–10. IEEE
Blüthgen N, Menzel F, Blüthgen N (2006) Measuring specialization in species interaction networks. BMC Ecol 6(1):9
Brenkert GG (1998) Trust, business and business ethics: an introduction. Bus Ethics Q 8(2):195–203
Brockner J (1996) Understanding the interaction between procedural and distributive justice: The role of trust
Burke M, Marlow C, Lento T (2009) Feed me: motivating newcomer contribution in social network sites. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 945–954. ACM
Burke M, Marlow C, Lento T (2010) Social network activity and social well-being. In: Proceedings of the SIGCHI conference on human factors in computing systems, pp 1909–1912. ACM
Calefato F, Lanubile F, Novielli N (2017) A preliminary analysis on the effects of propensity to trust in distributed software development. In: 2017 IEEE 12th international conference on global software engineering (ICGSE), pp 56–60. IEEE
Cameron AC, Trivedi PK (2013) Regression analysis of count data, vol 53. Cambridge University Press, Cambridge
Casalnuovo C, Vasilescu B, Devanbu P, Filkov V (2015) Developer onboarding in github: the role of prior social links and language experience. In: Proceedings of the 2015 10th joint meeting on foundations of software engineering, pp 817–828. ACM
Chow SC, Shao J, Wang H, Lokhnygina Y (2017) Sample size calculations in clinical research. Chapman and Hall/CRC
Cohen J (1988) Statistical power analysis for the behavioural sciences
Cohen J, Cohen P, West SG, Aiken LS (2013) Applied multiple regression/correlation analysis for the behavioral sciences. Routledge
da Costa DA, McIntosh S, Shang W, Kulesza U, Coelho R, Hassan AE (2017) A framework for evaluating the results of the szz approach for identifying bug-introducing changes. IEEE Trans Softw Eng 43(7):641–657
Dabbish L, Stuart C, Tsay J, Herbsleb J (2012) Social coding in github: transparency and collaboration in an open software repository. In: Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work, pp 1277–1286. ACM
Dourish P, Chalmers M (1994) Running out of space: Models of information navigation. In: Short paper presented at HCI, vol 94, pp 23–26
Ducheneaut N (2005) Socialization in an open source software community: A socio-technical analysis. Computer Supported Cooperative Work (CSCW) 14(4):323–368
Faraway JJ (2014) Linear models with R. CRC Press
Gallivan MJ (2001) Striking a balance between trust and control in a virtual organization: a content analysis of open source software case studies. Inf Syst J 11(4):277–304
Gharehyazie M, Posnett D, Filkov V (2013) Social activities rival patch submission for prediction of developer initiation in oss projects. In: 2013 29th IEEE international conference on software maintenance (ICSM), pp 340–349. IEEE
Gharehyazie M, Posnett D, Vasilescu B, Filkov V (2015) Developer initiation and social interactions in oss: A case study of the apache software foundation. Empir Softw Eng 20(5):1318–1353
Good IJ (1953) The population frequencies of species and the estimation of population parameters. Biometrika 40(3-4):237–264
Handy C (1995) Trust and the virtual organization. Harv Bus Rev 73(3):40–51
Hossain L, Zhu D (2009) Social networks and coordination performance of distributed software development teams. J High Technol Managem Res 20(1):52–61
Husted BW (1998) The ethical limits of trust in business relations. Bus Ethics Q 8(2):233–248
Ibrahim WM, Bettenburg N, Shihab E, Adams B, Hassan AE (2010) Should i contribute to this discussion? In: 2010 7th IEEE working conference on mining software repositories (MSR), pp 181–190. IEEE
Inglehart R (1999) Trust, well-being and democracy. Democracy and trust pp 88
Jarvenpaa SL, Knoll K, Leidner DE (1998) Is anybody out there? antecedents of trust in global virtual teams. J Manag Inf Syst 14(4):29–64
Jones TM, Bowie NE (1998) Moral hazards on the road to the “virtual” corporation. Bus Ethics Q 8(2):273–292
Kalliamvakou E, Damian D, Blincoe K, Singer L, German DM (2015) Open source-style collaborative development practices in commercial projects using github. In: Proceedings of the 37th international conference on software engineering-volume 1, pp 574–585. IEEE Press
Kavaler D, Sirovica S, Hellendoorn V, Aranovich R, Filkov V (2017) Perceived language complexity in github issue discussions and their effect on issue resolution. In: Proceedings of the 32nd IEEE/ACM international conference on automated software engineering, pp 72–83. IEEE Press
Kim S, Zimmermann T, Pan K, James Jr E, et al. (2006) Automatic identification of bug-introducing changes. In: ASE’06. 21st IEEE/ACM international conference on automated software engineering, 2006, pp 81–90. IEEE
Kramer RM, Tyler TR (1996) Trust in organizations: Frontiers of theory and research. Sage
Lee MJ, Ferwerda B, Choi J, Hahn J, Moon JY, Kim J (2013) Github developers use rockstars to overcome overflow of news. In: CHI’13 extended abstracts on human factors in computing systems, pp 133–138. ACM
Matter D, Kuhn A, Nierstrasz O (2009) Assigning bug reports using a vocabulary-based expertise model of developers. In: MSR’09. 6th IEEE international working conference on mining software repositories, 2009, pp 131–140. IEEE
McDonald N, Goggins S (2013) Performance and participation in open source software on github. In: CHI’13 extended abstracts on human factors in computing systems, pp 139–144. ACM
McKnight DH, Choudhury V, Kacmar C (2002) Developing and validating trust measures for e-commerce: An integrative typology. Inf Syst Res 13(3):334–359
Mockus A, Herbsleb JD (2002) Expertise browser: a quantitative approach to identifying expertise. In: Proceedings of the 24th international conference on software engineering, 2002. ICSE 2002, pp 503–512. IEEE
Murphy G, Cubranic D (2004) Automatic bug triage using text categorization. In: Proceedings of the 16th international conference on software engineering & knowledge engineering. Citeseer
Newton K (2001) Trust, social capital, civil society, and democracy. Int Polit Sci Rev 22(2):201–214
Oeldorf-Hirsch A, Sundar SS (2015) Posting, commenting, and tagging: Effects of sharing news stories on facebook. Comput Hum Behav 44:240–249
O’Leary M, Orlikowski W, Yates J (2002) Distributed work over the centuries: Trust and control in the hudson’s bay company, 1670–1826. Distributed work, pp 27–54
Posnett D, D’Souza R, Devanbu P, Filkov V (2013) Dual ecological measures of focus in software development. In: Proceedings of the 2013 international conference on software engineering, pp 452–461. IEEE Press
Qiu L, Lin H, Leung AKY (2013) Cultural differences and switching of in-group sharing behavior between an american (facebook) and a chinese (renren) social networking site. J Cross-Cult Psychol 44(1):106–121
Robert LP, Denis AR, Hung YTC (2009) Individual swift trust and knowledge-based trust in face-to-face and virtual team members. J Manag Inf Syst 26(2):241–279
Rodríguez G (2013) Models for count data with overdispersion
Rodríguez-Pérez G, Zaidman A, Serebrenik A, Robles G, González-Barahona JM (2018) What if a bug has a different origin? making sense of bugs without an explicit bug introducing change. In: Proceedings of the 12th ACM/IEEE international symposium on empirical software engineering and measurement, p 52. ACM
Saavedra R, Earley PC, Van Dyne L (1993) Complex interdependence in task-performing groups. J Appl Psychol 78(1):61
Sato Y, Arita S (2004) Impact of globalization on social mobility in Japan and korea: Focusing on middle classes in fluid societies. Int J Jpn Sociol 13(1):36–52
Śliwerski J, Zimmermann T, Zeller A (2005) When do changes induce fixes? In: ACM sigsoft software engineering notes, vol 30, pp 1–5. ACM
Steinmacher I, Conte T, Gerosa MA, Redmiles D (2015) Social barriers faced by newcomers placing their first contribution in open source software projects. In: Proceedings of the 18th ACM conference on Computer supported cooperative work & social computing, pp 1379–1392. ACM
Steinmacher I, Conte TU, Treude C, Gerosa MA (2016) Overcoming open source project entry barriers with a portal for newcomers. In: International conference on software engineering
Stolcke A, Ries K, Coccaro N, Shriberg E, Bates R, Jurafsky D, Taylor P, Martin R, Van Ess-Dykema C, Meteer M (2000) Dialogue act modeling for automatic tagging and recognition of conversational speech. Comput Linguist 26(3):339–373
Tsay J, Dabbish L, Herbsleb J (2014) Let’s talk about it: evaluating contributions through discussion in github. In: Proceedings of the 22nd ACM SIGSOFT international symposium on foundations of software engineering, pp 144–154. ACM
Vandekerckhove J, Matzke D, Wagenmakers EJ (2015) Model comparison and the principle of parsimony. In: The Oxford handbook of computational and mathematical psychology, vol 300. Oxford Library of Psychology
Vuong QH (1989) Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica: Journal of the Econometric Society pp 307–333
Yu Y, Wang H, Yin G, Wang T (2016) Reviewer recommendation for pull-requests in github: What can we learn from code review and bug assignment? Inf Softw Technol 74:204–218
Yu Y, Yin G, Wang H, Wang T (2014) Exploring the patterns of social behavior in github. In: Proceedings of the 1st international workshop on crowd-based software development methods and technologies, pp 31–36. ACM
Zhang Y, Wang H, Yin G, Wang T, Yu Y (2015) Exploring the use of @-mention to assist software development in github. In: Proceedings of the 7th Asia-pacific symposium on internetware, pp 83–92. ACM
Zhang Y, Wang H, Yin G, Wang T, Yu Y (2017) Social media in github: the role of @-mention in assisting software development. Science China Information Sciences 60(3):032102
Zucker LG (1986) Production of trust: Institutional sources of economic structure, 1840–1920. Research in organizational behavior
Additional information
Communicated by: Filippo Lanubile
Cite this article
Kavaler, D., Devanbu, P. & Filkov, V. Whom are you going to call? determinants of @-mentions in Github discussions. Empir Software Eng 24, 3904–3932 (2019). https://doi.org/10.1007/s10664-019-09728-3
DOI: https://doi.org/10.1007/s10664-019-09728-3