Studying the Relationship between Logging Characteristics and the Code Quality of Platform Software

Authors -

Weiyi Shang, Meiyappan Nagappan, and Ahmed E. Hassan

Venue -

Empirical Software Engineering, Published Online September 08 2013

Abstract -

Platform software plays an important role in speeding up the development of large-scale applications. Such platforms provide functionality and abstractions on which applications can be rapidly developed and easily deployed. Hadoop and JBoss are examples of popular open-source platform software. Such platform software generates logs to assist operators in monitoring the applications that run on it. These logs capture the doubts, concerns, and needs of developers and operators of platform software. We believe that such logs can be used to better understand code quality. However, logging characteristics and their relation to quality have never been explored. In this paper, we sought to empirically study this relation through a case study on four releases of Hadoop and JBoss. Our findings show that files with logging statements have higher post-release defect densities than those without logging statements in 7 out of 8 studied releases. Inspired by prior studies on code quality, we defined log-related product metrics, such as the number of log lines in a file, and log-related process metrics, such as the number of changed log lines. We find that the correlations between our log-related metrics and post-release defects are as strong as their correlations with traditional process metrics, such as the number of pre-release defects, which is known to be one of the metrics with the strongest correlation with post-release defects. We also find that log-related metrics can complement traditional product and process metrics, resulting in up to 40% improvement in explanatory power of defect proneness. Our results show that logging characteristics provide strong indicators of defect-prone source code files. However, we note that removing logs is not the answer to better code quality. Instead, our results show that it might be the case that developers often relay their concerns about a piece of code through logs.
Hence, code quality improvement efforts (e.g., testing and inspection) should focus more on the sou

Preprint -

PDF

BibTeX -

@article{Shang2013,
 author = {Shang, Weiyi and Nagappan, Meiyappan and Hassan, Ahmed E.},
 keywords = {Defect Prediction, Log File Analysis},
 title = {Studying the Relationship between Logging Characteristics and the Code Quality of Platform Software},
 journal = {Empirical Software Engineering},
 year = {2013},
 note = {Published online September 8, 2013}
}