A tale of bug prediction in software development

Transcript
  • 1. A Tale of Bug Prediction in Software Development. Martin Pinzger, Professor of Software Engineering, University of Klagenfurt, Austria. Follow me: @pinzger
  • 2. Software repositories
  • 3. Hmm, wait a minute: can’t we learn “something” from that data?
  • 4. Goal of software repository mining: Software Analytics. To obtain insightful and actionable information for completing various tasks around developing and maintaining software systems. Examples: quality analysis and defect prediction; detecting “hot-spots”; recommender (advisory) systems; code completion; suggesting good code examples; helping in using an API; ...
  • 5. Examples from my mining research: Predicting failure-prone source files using changes (MSR 2011); Predicting failure-prone methods (ESEM 2012); The relationship between developer contributions and failure-prone Microsoft Vista binaries (FSE 2008). Surveys on software repository mining: A survey and taxonomy of approaches for mining software repositories in the context of software evolution, Kagdi et al. 2007; Evaluating defect prediction approaches: a benchmark and an extensive comparison, D’Ambros et al. 2012. Conference: MSR 2015, http://2015.msrconf.org/
  • 6. Using Fine-Grained Source Code Changes for Bug Prediction. With Emanuel Giger and Harald Gall, University of Zurich.
  • 7. Bug prediction. Goal: train models to predict the bug-prone source files of the next release. How: using product measures, process measures, and organizational measures with machine learning techniques.
  • 8. Many existing studies: A comparative analysis of the efficiency of change metrics and static code attributes for defect prediction, Moser et al. 2008; Use of relative code churn measures to predict system defect density, Nagappan et al. 2005; Cross-project defect prediction: a large scale experiment on data vs. domain vs. process, Zimmermann et al. 2009; Predicting faults using the complexity of code changes, Hassan et al. 2009.
  • 9. Classical change measures: number of file revisions; Code Churn, aka lines added/deleted/changed. Can we improve existing prediction models?
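The two classical measures on slide 9 can be sketched in a few lines. This is only an illustration, not the tooling used in the study: it assumes the history is available as tab-separated `added<TAB>deleted<TAB>path` lines, the shape that `git log --numstat` prints, and the function name `churn_per_file` is hypothetical.

```python
from collections import defaultdict


def churn_per_file(numstat_text):
    """Count revisions and code churn (lines added + deleted) per file.

    Assumes tab-separated "added<TAB>deleted<TAB>path" lines, one per file
    per commit. Lines of any other shape are skipped.
    """
    revisions = defaultdict(int)
    churn = defaultdict(int)
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank lines, commit headers, etc.
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported with "-" instead of counts
        revisions[path] += 1
        churn[path] += int(added) + int(deleted)
    return dict(revisions), dict(churn)


# Hypothetical history: two commits, the second also touches a binary file.
sample = (
    "3\t1\tAccount.java\n"
    "10\t0\tBank.java\n"
    "\n"
    "2\t2\tAccount.java\n"
    "-\t-\tlogo.png\n"
)
revisions, churn = churn_per_file(sample)
```

Both numbers say how much a file changed, not what kind of change it was, which is the limitation the next slides pick up.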
  • 10. Revisions are coarse-grained: what did change in a revision?
  • 11. Code Churn can be imprecise: extra changes not relevant for locating bugs.
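  • 12. Fine-Grained Source Code Changes (SCC). [Figure: AST comparison of Account.java 1.5 and Account.java 1.6. In 1.5, an IF with condition "balance > 0" has a THEN part with the method invocation (MI) "withDraw(amount);". In 1.6, the condition is "balance > 0 && amount <= balance", the THEN part keeps "withDraw(amount);", and a new ELSE part contains the method invocation notify();.] 3 SCC: 1x condition change, 1x else-part insert, 1x invocation statement insert.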
  • 13. Categories of SCC: cDecl = changes to class declarations; oState = insertion and deletion of class attributes; func = insertion and deletion of methods; mDecl = changes to method declarations; stmt = insertion or deletion of executable statements; cond = changes to conditional expressions; else = insertion and deletion of else-parts.
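As a small illustration of how the seven categories can be turned into per-file or per-project numbers, the sketch below computes the relative frequency of each category from a list of category labels. The function name and the input format are assumptions made for this example; the real changes would come from a tool such as ChangeDistiller.

```python
from collections import Counter

# The seven SCC categories named on the slide.
CATEGORIES = ["cDecl", "oState", "func", "mDecl", "stmt", "cond", "else"]


def relative_frequencies(changes):
    """Return the share of each SCC category in a list of category labels."""
    counts = Counter(changes)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown SCC categories: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        return {cat: 0.0 for cat in CATEGORIES}
    return {cat: counts[cat] / total for cat in CATEGORIES}


# The three changes from the Account.java example: a condition change,
# an else-part insert, and an invocation statement insert.
changes = ["cond", "else", "stmt"]
freqs = relative_frequencies(changes)
```

Here each of the three observed categories gets a share of one third and the remaining four get zero.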
  • 14. Research hypotheses. H1: SCC is correlated with the number of bugs in source files. H2: SCC is a predictor for bug-prone source files (and outperforms Code Churn). H3: SCC is a predictor for the number of bugs in source files (and outperforms Code Churn).
  • 15. Data: 15 Eclipse plug-ins; >850’000 fine-grained source code changes (SCC); >10’000 files; >9’700’000 lines modified (LM = Code Churn); >9 years of development history; ... and a lot of bugs referenced in commit messages (e.g., bug #345).
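Linking bugs to files through commit messages, as in the "bug #345" example, could look roughly like the sketch below. The regular expression and both function names are assumptions for illustration only; the matching rules actually used in the study are not given on the slide.

```python
import re
from collections import defaultdict

# Assumed pattern: the word "bug", optional whitespace, an optional '#',
# then the bug number. Matches "bug #345", "Bug 345" and "#bug123".
BUG_REF = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)


def bug_ids(message):
    """Return the sorted, de-duplicated bug numbers referenced in a message."""
    return sorted({int(number) for number in BUG_REF.findall(message)})


def bugs_per_file(commits):
    """Count distinct referenced bugs per file.

    commits: iterable of (commit message, list of changed file paths).
    """
    linked = defaultdict(set)
    for message, files in commits:
        ids = bug_ids(message)
        for path in files:
            linked[path].update(ids)
    return {path: len(ids) for path, ids in linked.items() if ids}


# Hypothetical commit history.
commits = [
    ("Fixed bug #345 in withdrawal check", ["Account.java"]),
    ("Refactoring, no functional change", ["Account.java", "Bank.java"]),
    ("bug 346: notify on rejected withdrawal", ["Account.java", "Bank.java"]),
]
counts = bugs_per_file(commits)
```

A pattern like this is a heuristic: it misses fixes that do not mention a bug number and may pick up numbers that are not bug identifiers.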
  • 16. Typical experimental set-up. 1. Analyze quality and distribution of the data: use descriptive statistics, histograms, Q-Q plots; this determines the statistical methods that you can use. 2. Perform correlation analysis: Spearman (non-parametric). 3. Machine learners/classifiers: simple ones first (binary logistic regression, linear regression, decision trees); 10-fold cross validation, precision, recall, AUC ROC. 4. Interpretation and discussion of results (incl. threats to validity).
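The evaluation measures named in step 3 can be computed without any library. The sketch below is a minimal version under stated assumptions: labels are 1 for bug-prone and 0 for not bug-prone, a classifier has already produced a score per file, and AUC is computed as the share of (bug-prone, not bug-prone) pairs that the scores rank correctly, with ties counted as half. Function names and the example data are hypothetical.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = bug-prone, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc_roc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores for six files.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed cut-off of 0.5

precision, recall = precision_recall(y_true, y_pred)
auc = auc_roc(y_true, scores)
```

Precision and recall depend on the chosen cut-off, while AUC summarizes the ranking over all cut-offs. In a 10-fold cross validation these measures would be computed on each held-out fold.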
  • 17. Approach overview. ... (LM), (2) bug data, i.e., which files contained bugs and how many of them (Bugs), and (3) fine-grained source code changes (SCC). [Figure 1: Stepwise overview of the data extraction process. 1. Versioning Data (CVS, SVN, GIT; Evolizer RHDB; log entries). 2. Bug Data (bug references in commit messages, e.g., #bug123). 3. Source Code Changes (SCC) (ChangeDistiller; AST comparison of subsequent versions, e.g., 1.1 and 1.2). 4. Experiment (Support Vector Machine).]
  • 18. Frequency of change type categories. Table 3: Relative frequencies of SCC categories per Eclipse project, plus their mean and variance over all selected projects.

      Eclipse Project   cDecl   oState  func    mDecl   stmt    cond    else
      Compare           0.01    0.06    0.08    0.05    0.74    0.03    0.03
      jFace             0.02    0.04    0.08    0.11    0.70    0.02    0.03
      JDT Debug         0.02    0.06    0.08    0.10    0.70    0.02    0.02
      Resource          0.01    0.04    0.02    0.11    0.77    0.03    0.02
      Runtime           0.01    0.05    0.07    0.10    0.73    0.03    0.01
      Team Core         0.05    0.04    0.13    0.17    0.57    0.02    0.02
      CVS Core          0.01    0.04    0.10    0.07    0.73    0.02    0.03
      Debug Core        0.04    0.07    0.02    0.13    0.69    0.02    0.03
      jFace Text        0.04    0.03    0.06    0.11    0.70    0.03    0.03
      Update Core       0.02    0.04    0.07    0.09    0.74    0.02    0.02
      Debug UI          0.02    0.06    0.09    0.07    0.70    0.03    0.03
      JDT Debug UI      0.01    0.07    0.07    0.05    0.75    0.02    0.03
      Help              0.02    0.05    0.08    0.07    0.73    0.02    0.03
      JDT Core          0.00    0.03    0.03    0.05    0.80    0.05    0.04
      OSGI              0.03    0.04    0.06    0.11    0.71    0.03    0.02
      Mean              0.02    0.05    0.07    0.09    0.72    0.03    0.03
      Variance          0.000   0.000   0.001   0.001   0.003   0.000   0.000
  • 19. H1: SCC is correlated with #bugs. Table: Non-parametric Spearman rank correlation of ... and SCC; * marks significant correlations at ...; larger values are printed bold. Columns: Eclipse Project, LM, SCC. First row: Compare, 0.68, ...
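For readers who want to see what the Spearman rank correlation behind H1 computes, here is a minimal pure-Python sketch: both variables are converted to ranks (ties receive the average rank), and the Pearson correlation of the ranks is returned. The per-file numbers are made up for illustration and are not taken from the study; significance testing is not covered.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of equal values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical per-file data: number of SCC and number of linked bugs.
scc = [12, 40, 3, 27, 8]
bugs = [1, 5, 0, 2, 1]
rho = spearman(scc, bugs)
```

Because only ranks are used, the measure does not assume normally distributed data, which is why the set-up slide lists it as the non-parametric choice.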