TY - GEN
T1 - Using blind analysis for software engineering experiments
AU - Sigweni, Boyce
AU - Shepperd, Martin
N1 - Publisher Copyright:
Copyright 2015 ACM.
PY - 2015/4/27
Y1 - 2015/4/27
N2 - Context: In recent years there has been growing concern about conflicting experimental results in empirical software engineering. This has been paralleled by a growing awareness of how bias can affect research results. Objective: To explore the practicalities of blind analysis of experimental results as a means of reducing bias. Method: We apply blind analysis to a real software engineering experiment that compares three feature weighting approaches and a naïve benchmark (the sample mean) on the Finnish software effort data set. We use this experiment as an example to explore blind analysis as a method for reducing researcher bias. Results: Our experience shows that blinding can be a relatively straightforward procedure. We also highlight various statistical analysis decisions that ought not to be guided by the hunt for statistical significance, and show that results can be inverted merely through a seemingly inconsequential statistical nicety (i.e., the degree of trimming). Conclusion: Whilst there are minor challenges and some limits to the degree of blinding possible, blind analysis is a practical, easy-to-implement method that supports more objective analysis of experimental results. We therefore argue that blind analysis should be the norm for analysing software engineering experiments.
AB - Context: In recent years there has been growing concern about conflicting experimental results in empirical software engineering. This has been paralleled by a growing awareness of how bias can affect research results. Objective: To explore the practicalities of blind analysis of experimental results as a means of reducing bias. Method: We apply blind analysis to a real software engineering experiment that compares three feature weighting approaches and a naïve benchmark (the sample mean) on the Finnish software effort data set. We use this experiment as an example to explore blind analysis as a method for reducing researcher bias. Results: Our experience shows that blinding can be a relatively straightforward procedure. We also highlight various statistical analysis decisions that ought not to be guided by the hunt for statistical significance, and show that results can be inverted merely through a seemingly inconsequential statistical nicety (i.e., the degree of trimming). Conclusion: Whilst there are minor challenges and some limits to the degree of blinding possible, blind analysis is a practical, easy-to-implement method that supports more objective analysis of experimental results. We therefore argue that blind analysis should be the norm for analysing software engineering experiments.
UR - http://www.scopus.com/inward/record.url?scp=84961153212&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961153212&partnerID=8YFLogxK
U2 - 10.1145/2745802.2745832
DO - 10.1145/2745802.2745832
M3 - Conference contribution
AN - SCOPUS:84961153212
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the 19th International Conference on Evaluation and Assessment in Software Engineering, EASE 2015
PB - Association for Computing Machinery
T2 - 19th International Conference on Evaluation and Assessment in Software Engineering, EASE 2015
Y2 - 27 April 2015 through 29 April 2015
ER -