Abstract
In the age of big data, data science is an essential skill for software engineers. It can be used to predict useful
information about new projects based on completed projects. This tutorial reflects on the state of the art in this important field.
Before data mining, this tutorial discusses the tasks needed to deploy data mining algorithms in organizations, including how to determine the information needs of particular managers.
During data mining, this tutorial shows the following: (a) when studying particular organizations, how to use surveys and interviews to guide data analysis; (b) when local data is scarce, how to adapt data from other organizations to local problems; (c) when working with data of dubious quality, how to prune spurious information; (d) when data or models seem too complex, how to simplify data mining results; (e) when the world changes and old models need to be updated, how to handle those updates; (f) when the effect is too complex for one model, how to reason over ensembles.
Target audience: Software practitioners and researchers wanting to understand the state of the art in using data mining for
software engineering (SE) data.
Pre-requisites: This tutorial makes minimal use of maths or advanced algorithms and should be understandable by developers and technical managers.
Original language | English |
---|---|
Title of host publication | ICSE’14 Tutorial Briefing (proposed length: half day) TUT: The Art and Science of Analyzing Software Data |
Subtitle of host publication | May 31 – June 7, 2014, Hyderabad, India |
Editors | Pankaj Jalote, Lionel Briand, André van der Hoek |
Place of Publication | New York NY USA |
Publisher | Association for Computing Machinery (ACM) |
Number of pages | 4 |
ISBN (Electronic) | 9781450327565 |
Publication status | Published - 2014 |
Externally published | Yes |
Event | International Conference on Software Engineering 2014 - Hyderabad, India; Duration: 31 May 2014 → 7 Jun 2014; Conference number: 36th; http://2014.icse-conferences.org/ |
Conference
Conference | International Conference on Software Engineering 2014 |
---|---|
Abbreviated title | ICSE 2014 |
Country/Territory | India |
City | Hyderabad |
Period | 31/05/14 → 7/06/14 |
Internet address |