Otago University Research Archive

A comparison of software effort prediction models using small datasets


dc.contributor.author van Koten, Chikako en_NZ
dc.date.copyright 2007-05 en_NZ
dc.identifier.citation van Koten, C. (2007, May). A comparison of software effort prediction models using small datasets. University of Otago. en
dc.identifier.uri http://hdl.handle.net/10523/1461
dc.description Submitted to IEEE Transactions on Software Engineering. If published, this version will be replaced by the final version. en_NZ
dc.description.abstract Constructing an accurate effort prediction model is a challenge in Software Engineering. One difficulty practitioners often experience is that they have only a very small amount of local data from which to construct a model. A small dataset limits the predictive accuracy of the model, since accuracy deteriorates as the size of the dataset decreases. This paper compares three software development effort prediction models that are applicable to such small datasets: (1) Bayesian statistical models, (2) multiple linear regression models and (3) case-based reasoning/analogy-based models. The predictive accuracy of these models is evaluated using two different software datasets. The results show that the accuracy of the Bayesian statistical models is higher than or competitive with that of the others when the models are calibrated using data collected from fewer than 10 systems. These results suggest that the Bayesian statistical model would be a better choice for effort prediction when practitioners have only a very small dataset, consisting of fewer than 10 systems similar to their system of interest. en_NZ
dc.format.mimetype application/pdf
dc.publisher University of Otago en_NZ
dc.subject multivariate statistics en_NZ
dc.subject modeling methodologies en_NZ
dc.subject management techniques en_NZ
dc.subject statistical methods en_NZ
dc.subject cost estimation en_NZ
dc.subject time estimation en_NZ
dc.subject.lcsh QA75 Electronic computers. Computer science en_NZ
dc.subject.lcsh QA76 Computer software en_NZ
dc.title A comparison of software effort prediction models using small datasets en_NZ
dc.type Other Type en_NZ
dc.description.version Submitted en_NZ
otago.bitstream.pages 41 en_NZ
otago.date.accession 2007-05-08 en_NZ
otago.school Information Science en_NZ
otago.openaccess Open
otago.place.publication Dunedin, New Zealand en_NZ
dc.identifier.eprints 688 en_NZ
otago.school.eprints Software Engineering & Collaborative Modelling Laboratory en_NZ
otago.school.eprints Information Science en_NZ
dc.description.references
[1] C.J. Burgess and M. Lefley. Can genetic programming improve software effort estimation? a comparative evaluation. Information and Software Technology, 43:863–873, 2001.
[2] P. Congdon. Bayesian Statistical Modelling. John Wiley & Sons, 2001.
[3] S.D. Conte, H.E. Dunsmore, and V.Y. Shen. Software Engineering Metrics and Models. Benjamin/Cummings Publishing Company, 1986.
[4] N.E. Fenton and S.L. Pfleeger. Software Metrics: A Rigorous & Practical Approach. PWS Publishing Company, second edition, 1997.
[5] T. Foss, E. Stensrud, B. Kitchenham, and I. Myrtveit. A simulation study of the model evaluation criterion mmre. IEEE Transactions on Software Engineering, 29(11):985–995, 2003.
[6] R.L. Glass. Frequently forgotten fundamental facts about software engineering. IEEE Software, May/June:110–112, 2001.
[7] A.R. Gray. A simulation-based comparison of empirical modeling techniques for software metric models of development effort. In Proceedings of the 6th International Conference on Neural Information Processing, pages 526–531, 1999.
[8] A.R. Gray and S.G. MacDonell. A comparison of alternatives to regression analysis as model building techniques to develop predictive equations for software metrics. Information and Software Technology, 39:425–437, 1997.
[9] A.R. Gray and S.G. MacDonell. Software metrics data analysis - exploring the relative performance of some commonly used modeling technique. Empirical Software Engineering, 4:297–316, 1999.
[10] P.J. Green. A primer on markov chain monte carlo. In O.E. Barndorff-Nielsen, D.R. Cox, and C. Klüppelberg, editors, Complex Stochastic Systems, chapter 1, pages 1–62. Chapman & Hall/CRC, 2001.
[11] F.J. Heemstra. Software cost estimation. Information and Software Technology, 34(10):627–639, 1992.
[12] R. Jeffery, M. Ruhe, and I. Wieczorek. A comparative study of two software development cost modeling techniques using multi-organizational and company-specific data. Information and Software Technology, 42:1009–1016, 2000.
[13] M. Jørgensen. Experience with the accuracy of software maintenance task effort prediction models. IEEE Transactions on Software Engineering, 21(8):674–681, 1995.
[14] B. Kitchenham, E. Mendes, and G.H. Travassos. A systematic review of cross- vs. within-company cost estimation studies. In Proceedings of the 10th International Conference on Evaluation and Assessment in Software Engineering (EASE2006), 2006.
[15] B.A. Kitchenham. Empirical studies of assumptions that underlie software cost-estimation models. Information and Software Technology, 34(4):211–218, 1992.
[16] B.A. Kitchenham, L.M. Pickard, S.G. MacDonell, and M.J. Shepperd. What accuracy statistics really measure. IEE Proceedings–Software, 148(3):81–85, 2001.
[17] H. Lee. A structured methodology for software development effort prediction using the analytic hierarchy process. Journal of Systems Software, 21:179–186, 1993.
[18] S.G. MacDonell. Establishing relationships between specification size and software process effort in CASE environment. Information and Software Technology, 39:35–45, 1997.
[19] S.G. MacDonell and A.R. Gray. Alternatives to regression models for estimating software projects. In Proceedings of the IFPUG Fall Conference, 1996.
[20] S.G. MacDonell and A.R. Gray. A comparison of modeling techniques for software development effort prediction. In Proceedings of the 1997 International Conference on Neural Information Processing and Intelligent Information Systems, pages 869–872, 1997.
[21] S.G. MacDonell and M.J. Shepperd. Combining techniques to optimize effort predictions in software project management. The Journal of Systems and Software, 66:91–98, 2003.
[22] C. Mair, G. Kadoda, M. Lefley, K. Phalp, C. Schofield, M. Shepperd, and S. Webster. An investigation of machine learning based prediction systems. The Journal of Systems and Software, 53:23–29, 2000.
[23] K. Maxwell, L. Van Wassenhove, and S. Dutta. Performance evaluation of general and company specific models in software development effort estimation. Management Science, 45(6):787–803, 1999.
[24] E. Mendes, C. Lokan, R. Harrison, and C. Triggs. A replicated comparison of cross-company and within-company effort estimation models using the isbsg database. In Proceedings of the 11th International Symposium on Software Metrics (METRICS’05), 2005.
[25] C.S. Murali and C.S. Sankar. Issues in estimating real-time data communications software projects. Information and Software Technology, 39:399–402, 1997.
[26] L. Pickard, B.A. Kitchenham, and S. Linkman. An investigation of analysis techniques for software datasets. In Proceedings of the 6th International Software Metrics Symposium (METRICS’99), pages 130–142, 1999.
[27] J. Sayyad Shirabad and T.J. Menzies. The PROMISE repository of software engineering databases. School of Information Technology and Engineering, University of Ottawa, Canada, 2005. http://promise.site.uottawa.ca/SERepository.
[28] M. Shepperd and G. Kadoda. Comparing software prediction techniques using simulation. IEEE Transactions on Software Engineering, 27(11):1014–1022, 2001.
[29] M. Shepperd and C. Schofield. Estimating software project effort using analogy. IEEE Transactions on Software Engineering, 23(12):736–743, 1997.
[30] K. Srinivasan and D. Fisher. Machine learning approaches to estimating software development effort. IEEE Transactions on Software Engineering, 21(2):126–136, 1995.
[31] E. Stensrud. Alternative approaches to effort prediction of erp projects. Information and Software Technology, 43:413–423, 2001.
[32] E. Stensrud, T. Foss, B.A. Kitchenham, and I. Myrtveit. An empirical validation of the relationship between the magnitude of relative error and project size. In Proceedings of the 8th IEEE Symposium on Software Metrics (METRICS’02), pages 3–12, 2002.
[33] C. van Koten. Bayesian statistical models for predicting software development effort. The Journal of Systems and Software, Submitted in 2006.
[34] C. van Koten and A.R. Gray. Bayesian statistical effort prediction models for data-centred 4gl software development. Information and Software Technology, 48:1056–1067, 2006. en_NZ
