Online Program

All Times ET

Friday, June 4
Practice and Applications
Data Science Shaping Innovative Applications
Fri, Jun 4, 11:25 AM - 1:00 PM
TBD
 

Community Detection in Open Source Software Collaboration Networks (309689)

J. Bayoan Santiago Calderon, University of Virginia 
Gizem Korkmaz, University of Virginia 
Brandon L Kramer, University of Virginia 
*Behnaz Moradijamei, University of Virginia 

Keywords: Open Source Software, Community Detection, OSS Community Structure

In this paper, we identify and study the communities formed on open source software (OSS) collaboration networks using a dataset of 3.26 million GitHub users. While most existing work examines how small-scale OSS projects emerge, our work draws on a large-scale network of contributors from GitHub, the world’s largest remote hosting platform. OSS collaborations are characterized by small groups of users who collaborate closely and thus form more short cycles of collaboration within a community than across communities. To better understand how communities are shaped by the cyclic structure of the network (rather than only by the existence of edges in the graph), we introduce a novel method for detecting communities. As a preprocessing step, we combine this cyclic property with the strength of collaboration among users, so that the clustering methods receive additional topological information about how each edge participates in the cyclic structure of the groups. Specifically, we first preprocess the network data using the Renewal Non-Backtracking Random Walk (RNBRW) and then apply state-of-the-art clustering methods such as Louvain (Blondel et al., 2008) and Clauset-Newman-Moore (CNM). This method provides a stronger approach for detecting small-scale team formation by accounting for preferential attachment to more established collaboration communities. This paper offers useful insights for both open-source software experts and network scholars interested in studying group formation.
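
The preprocessing idea described above can be illustrated with a minimal sketch of cycle-based edge weighting in the spirit of RNBRW. This is not the authors' implementation: the function name `rnbrw_weights`, the parameters `n_walks` and `seed`, and the toy edge list are all assumptions made for illustration, and the sketch ignores the collaboration-strength component mentioned in the abstract. Each walk starts on a random edge, never steps straight back along the edge it just used, and stops as soon as it reaches an already visited node; the edge that closes the cycle is credited, and the walk is then restarted.

```python
import random
from collections import defaultdict


def rnbrw_weights(edges, n_walks=10000, seed=0):
    """Estimate cycle-closing frequencies for each undirected edge.

    Illustrative sketch only. Assumes a simple undirected graph
    (no self-loops, no duplicate edges) with sortable node labels.
    """
    rng = random.Random(seed)
    edge_list = list(edges)

    # Build an adjacency list for the undirected graph.
    adj = defaultdict(list)
    for u, v in edge_list:
        adj[u].append(v)
        adj[v].append(u)

    counts = {frozenset(e): 0 for e in edge_list}

    for _ in range(n_walks):
        # Start on a uniformly random edge, in a random direction.
        u, v = rng.choice(edge_list)
        if rng.random() < 0.5:
            u, v = v, u
        visited = {u, v}
        prev, cur = u, v

        while True:
            # Non-backtracking step: never return along the edge just used.
            candidates = [w for w in adj[cur] if w != prev]
            if not candidates:
                break  # dead end: this walk credits no edge
            nxt = rng.choice(candidates)
            if nxt in visited:
                # The step closes a cycle: credit the closing edge, then renew.
                counts[frozenset((cur, nxt))] += 1
                break
            visited.add(nxt)
            prev, cur = cur, nxt

    total = sum(counts.values())
    return {
        tuple(sorted(e)): (c / total if total else 0.0)
        for e, c in counts.items()
    }


# Toy example (hypothetical data): two triangles joined by the bridge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
weights = rnbrw_weights(toy_edges, n_walks=2000, seed=1)
```

In the toy graph, the bridge (2, 3) lies on no cycle, so it can never close one and its weight is zero, while the edges inside each triangle share all of the weight. One option for the clustering step is to attach these values as edge weights and pass them to a weighted modularity routine, for example `louvain_communities` or `greedy_modularity_communities` (a CNM implementation) in NetworkX, both of which accept a `weight` argument.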