Data science for software engineering

Tim Menzies, Ekrem Kocaguneli, Fayola Peters, Burak Turhan, Leandro L. Minku

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research, peer-review

Abstract

Target audience: Software practitioners and researchers wanting to understand the state of the art in using data science for software engineering (SE).

Content: In the age of big data, data science (the knowledge of deriving meaningful outcomes from data) is an essential skill that software engineers should have. It can be used to predict useful information on new projects based on completed projects. This tutorial offers core insights into the state of the art in this important field.

What participants will learn: Before data science: this tutorial discusses the tasks needed to deploy machine-learning algorithms to organizations (Part 1: Organization Issues). During data science: from discretization to clustering to dichotomization and statistical analysis. And the rest: When local data is scarce, we show how to adapt data from other organizations to local problems. When privacy concerns block access, we show how to privatize data while still being able to mine it. When working with data of dubious quality, we show how to prune spurious information. When data or models seem too complex, we show how to simplify data mining results. When data is too scarce to support intricate models, we show methods for generating predictions. When the world changes, and old models need to be updated, we show how to handle those updates. When the effect is too complex for one model, we show how to reason across ensembles of models.

Pre-requisites: This tutorial makes minimal use of maths or advanced algorithms and should be understandable to developers and technical managers.

Language: English
Title of host publication: 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings
Pages: 1484-1486
Number of pages: 3
DOIs: https://doi.org/10.1109/ICSE.2013.6606752
Publication status: Published - 30 Oct 2013
Externally published: Yes
Event: 2013 35th International Conference on Software Engineering, ICSE 2013 - San Francisco, CA, United States
Duration: 18 May 2013 - 26 May 2013

Conference

Conference: 2013 35th International Conference on Software Engineering, ICSE 2013
Country: United States
City: San Francisco, CA
Period: 18/05/13 - 26/05/13

Cite this

Menzies, T., Kocaguneli, E., Peters, F., Turhan, B., & Minku, L. L. (2013). Data science for software engineering. In 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings (pp. 1484-1486). [6606752] https://doi.org/10.1109/ICSE.2013.6606752
Menzies, Tim ; Kocaguneli, Ekrem ; Peters, Fayola ; Turhan, Burak ; Minku, Leandro L. / Data science for software engineering. 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings. 2013. pp. 1484-1486
@inproceedings{0c080c1eb9ef4f66a4ba0762398bd170,
title = "Data science for software engineering",
author = "Tim Menzies and Ekrem Kocaguneli and Fayola Peters and Burak Turhan and Minku, {Leandro L.}",
year = "2013",
month = "10",
day = "30",
doi = "10.1109/ICSE.2013.6606752",
language = "English",
isbn = "9781467330763",
pages = "1484--1486",
booktitle = "2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings",

}

Menzies, T, Kocaguneli, E, Peters, F, Turhan, B & Minku, LL 2013, Data science for software engineering. in 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings., 6606752, pp. 1484-1486, 2013 35th International Conference on Software Engineering, ICSE 2013, San Francisco, CA, United States, 18/05/13. https://doi.org/10.1109/ICSE.2013.6606752

Data science for software engineering. / Menzies, Tim; Kocaguneli, Ekrem; Peters, Fayola; Turhan, Burak; Minku, Leandro L.

2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings. 2013. p. 1484-1486 6606752.

TY - GEN

T1 - Data science for software engineering

AU - Menzies, Tim

AU - Kocaguneli, Ekrem

AU - Peters, Fayola

AU - Turhan, Burak

AU - Minku, Leandro L.

PY - 2013/10/30

Y1 - 2013/10/30

UR - http://www.scopus.com/inward/record.url?scp=84886412343&partnerID=8YFLogxK

U2 - 10.1109/ICSE.2013.6606752

DO - 10.1109/ICSE.2013.6606752

M3 - Conference Paper

SN - 9781467330763

SP - 1484

EP - 1486

BT - 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings

ER -

Menzies T, Kocaguneli E, Peters F, Turhan B, Minku LL. Data science for software engineering. In 2013 35th International Conference on Software Engineering, ICSE 2013 - Proceedings. 2013. p. 1484-1486. 6606752 https://doi.org/10.1109/ICSE.2013.6606752