Collaborative data analytics towards prediction on pathogen-host protein-protein interactions

Huaming Chen, Jun Shen, Lei Wang, Jiangning Song

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

Abstract

Ever more data are being sequenced and accumulated in systems biology, bringing data analytics researchers into a new era of "big data", in which the challenge is to extract underlying relationships and knowledge from huge volumes of data. Bridging the gap between computational methodology and biology to accelerate the development of biological analytics has become an active research area. In this paper, we focus on the enormous amounts of data generated by the rapid development of high-throughput technologies over the past decades, especially data on protein-protein interactions, which are critical molecular processes in biology. Because pathogen-host protein-protein interactions are a fundamental problem for both infectious disease research and drug design, molecular-level interactions between pathogen and host play a critical role in the study of infection mechanisms. We build a basic framework for analyzing specific problems concerning pathogen-host protein-protein interactions (PHPPI), and we present results for a state-of-the-art deep learning method on PHPPI prediction in comparison with other machine learning methods. Using evaluation methods chosen to account for the highly skewed class imbalance and the large volume of data, we detail a pipeline for both storing and learning from PHPPI data. This work provides a basis for further investigation of proteins and protein-protein interactions, drawing together data analytics results from the vast amount of data dispersed across the biology literature.
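The abstract's point about evaluating under a highly skewed class ratio can be illustrated with a small sketch. The abstract does not name the metrics the paper uses, so the choice of precision, recall, F1 and the Matthews correlation coefficient (MCC) here is an assumption, as are the function name `imbalance_metrics` and the toy labels. The sketch shows why plain accuracy is misleading when interacting pairs are rare.

```python
import math


def imbalance_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1 and MCC for binary labels.

    Labels are 1 (interacting pair) or 0 (non-interacting pair).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Toy data (assumed): 5 interacting pairs among 100 candidates.
y_true = [1] * 5 + [0] * 95

# A trivial predictor that always answers "no interaction".
baseline = imbalance_metrics(y_true, [0] * 100)

# A predictor that finds 3 of the 5 positives and raises 2 false alarms.
model = imbalance_metrics(y_true, [1, 1, 1, 0, 0] + [1, 1] + [0] * 93)

print(baseline)
print(model)
```

On this toy data the trivial predictor and the real one have almost the same accuracy, while recall, F1 and MCC separate them clearly, which is the reason imbalance-aware metrics are preferred for such data.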

Original language: English
Title of host publication: 2017 IEEE 21st International Conference on Computer Supported Cooperative Work in Design (CSCWD)
Editors: Weiming Shen, Pedro Antunes, Nguyen Hoang Thuan, Jean-Paul Barthes, Junzhou Luo, Jianming Yong
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 269-274
Number of pages: 6
Edition: 1st
ISBN (Electronic): 9781509061990
DOIs: https://doi.org/10.1109/CSCWD.2017.8066706
Publication status: Published - 12 Oct 2017
Event: International Conference on Computer Supported Cooperative Work in Design 2017 - Wellington, New Zealand
Duration: 26 Apr 2017 → 28 Apr 2017
Conference number: 21st
https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8053876 (Proceedings)

Conference

Conference: International Conference on Computer Supported Cooperative Work in Design 2017
Abbreviated title: CSCWD 2017
Country: New Zealand
City: Wellington
Period: 26/04/17 → 28/04/17

Keywords

  • big data
  • bioinformatics
  • machine learning
  • PHPPI

Cite this

Chen, H., Shen, J., Wang, L., & Song, J. (2017). Collaborative data analytics towards prediction on pathogen-host protein-protein interactions. In W. Shen, P. Antunes, N. H. Thuan, J-P. Barthes, J. Luo, & J. Yong (Eds.), 2017 IEEE 21st International Conference on Computer Supported Cooperative Work in Design (CSCWD) (1st ed., pp. 269-274). [8066706] IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/CSCWD.2017.8066706
@inproceedings{13fddb8f3e544586954aba96c0e2fd0d,
title = "Collaborative data analytics towards prediction on pathogen-host protein-protein interactions",
keywords = "big data, bioinformatics, machine learning, PHPPI",
author = "Huaming Chen and Jun Shen and Lei Wang and Jiangning Song",
year = "2017",
month = "10",
day = "12",
doi = "10.1109/CSCWD.2017.8066706",
language = "English",
pages = "269--274",
editor = "Weiming Shen and Pedro Antunes and Thuan, {Nguyen Hoang} and Jean-Paul Barthes and Junzhou Luo and Jianming Yong",
booktitle = "2017 IEEE 21st International Conference on Computer Supported Cooperative Work in Design (CSCWD)",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
address = "United States of America",
edition = "1st",

}

TY - GEN

T1 - Collaborative data analytics towards prediction on pathogen-host protein-protein interactions

AU - Chen, Huaming

AU - Shen, Jun

AU - Wang, Lei

AU - Song, Jiangning

PY - 2017/10/12

Y1 - 2017/10/12

KW - big data

KW - bioinformatics

KW - machine learning

KW - PHPPI

UR - http://www.scopus.com/inward/record.url?scp=85032356964&partnerID=8YFLogxK

U2 - 10.1109/CSCWD.2017.8066706

DO - 10.1109/CSCWD.2017.8066706

M3 - Conference Paper

SP - 269

EP - 274

BT - 2017 IEEE 21st International Conference on Computer Supported Cooperative Work in Design (CSCWD)

A2 - Shen, Weiming

A2 - Antunes, Pedro

A2 - Thuan, Nguyen Hoang

A2 - Barthes, Jean-Paul

A2 - Luo, Junzhou

A2 - Yong, Jianming

PB - IEEE, Institute of Electrical and Electronics Engineers

ER -
