The applicability of case-based reasoning to software cost estimation.
The nature and competitiveness of the modern software development industry demand that software engineers be able to make accurate and consistent software cost estimates. Traditionally, software cost estimates have been derived with algorithmic cost estimation models such as COCOMO and Function Point Analysis. However, researchers have shown that existing software cost estimation techniques fail to produce accurate and consistent estimates. Improving the reliability of software cost estimates would facilitate cost savings, improved delivery times and better quality software. To this end, considerable research has been conducted into finding alternative software cost estimation models that are able to produce better quality estimates, and researchers have suggested a number of alternative models. One of the most promising alternatives is Case-Based Reasoning (CBR), a machine learning paradigm that makes use of past experiences to solve new problems. CBR has been proposed as a solution since it is highly suited to weak theory domains, where the relationships between cause and effect are not well understood. The aim of this research was to determine the applicability of CBR to software cost estimation. This was accomplished in part through a thorough investigation of the theoretical and practical background to CBR, software cost estimation and current research on CBR applied to software cost estimation. This investigation provided a foundation for the development of experimental CBR software cost estimation models, with which an empirical evaluation of this technology applied to software cost estimation was performed. In addition, several regression models were developed, against which the effectiveness of the CBR system could be evaluated. The architecture of the CBR models facilitated the investigation of the effects of case granularity on the quality of the results obtained from them.
Traditionally, researchers in this field have made use of poorly populated datasets, which did not accurately reflect the true nature of the software development industry. For the purposes of this research, however, an extensive database of 300 software development projects was obtained, on which the experiments were performed. The experimental results indicated that the CBR models developed performed similarly to, and in some cases better than, those developed by other researchers. In terms of the quality of the results produced, the best CBR model significantly outperformed the best regression model. In addition, increased case granularity was shown to result in better quality predictions by the CBR models. These promising results experimentally validated CBR as an applicable software cost estimation technique. It was also shown that CBR has a number of methodological advantages over traditional cost estimation techniques.
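
The way CBR reuses past experience to solve a new problem can be illustrated with a minimal sketch. It is not the models developed in this research. It assumes a simple nearest-neighbour retrieval over range-normalised features, with the estimate taken as the mean effort of the retrieved cases. The feature names (`size_fp`, `team`), the case values and the function names are all hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical case base: (project features, actual effort in person-months).
# These values are invented for illustration and are not from the 300-project database.
CASE_BASE = [
    ({"size_fp": 100.0, "team": 3.0}, 10.0),
    ({"size_fp": 200.0, "team": 5.0}, 24.0),
    ({"size_fp": 400.0, "team": 8.0}, 60.0),
    ({"size_fp": 120.0, "team": 4.0}, 14.0),
]


def feature_ranges(cases):
    """Return the (min, max) of each feature so distances can be normalised."""
    keys = cases[0][0].keys()
    return {
        k: (min(c[0][k] for c in cases), max(c[0][k] for c in cases))
        for k in keys
    }


def distance(a, b, ranges):
    """Euclidean distance over features scaled by their observed range."""
    total = 0.0
    for k, (lo, hi) in ranges.items():
        span = (hi - lo) or 1.0  # avoid division by zero for constant features
        total += ((a[k] - b[k]) / span) ** 2
    return sqrt(total)


def estimate_effort(query, cases, k=2):
    """Retrieve the k most similar past projects and reuse their mean effort."""
    ranges = feature_ranges(cases)
    nearest = sorted(cases, key=lambda c: distance(query, c[0], ranges))[:k]
    return sum(effort for _, effort in nearest) / len(nearest)


if __name__ == "__main__":
    new_project = {"size_fp": 110.0, "team": 3.0}
    print(estimate_effort(new_project, CASE_BASE, k=2))
```

In this sketch the new project is matched to the two most similar stored projects and their efforts are averaged. A fuller CBR system would typically also adapt the retrieved solution and retain the new case once its actual effort is known.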