Abstract
Cost estimation and effort allocation are key challenges for successful project planning and management in software development. Both industry and the research community have therefore been working on models and techniques to predict project cost accurately. Recently, researchers have started debating whether prediction performance depends on the structure of the data rather than on the models used. In this article, we focus on a new aspect of data homogeneity, "cross- versus within-application domain", and investigate what kind of training data should be used for software cost estimation in the embedded systems domain. In addition, we examine the effect of training dataset size on prediction performance. Based on our empirical results, we conclude that it is better to use cross-domain data for embedded software cost estimation and that the optimum training data size depends on the method used.
Original language | English |
---|---|
Pages (from-to) | 57-80 |
Number of pages | 24 |
Journal | Software Quality Journal |
Volume | 18 |
Issue number | 1 |
Publication status | Published - 1 Jan 2009 |
Externally published | Yes |
Keywords
- Application domain
- Cost estimation
- Data homogeneity
- Embedded software
- Machine learning