Identifying high-risk breast cancer patients using microarray and clinical data

Azni Nasuha Ngisa, Ong Huey Fang

Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review


The performance of DNA microarray in breast cancer prediction demonstrates the potential of genome-wide analysis using gene expression data. Therefore, this study proposed a prediction method called GridPCA to identify high-risk breast cancer patients using both microarray and clinical data. The proposed method employs GridSearch and Principal Component Analysis to deal with the high dimensionality of microarray data. The experimental results showed that GridPCA achieved an average predictive accuracy of approximately 82% with Decision Tree, K-Nearest Neighbour, Logistic Regression and Support Vector Machine classifiers. In future, the proposed method could be used to develop systems that help doctors plan, make decisions and tailor appropriate treatments to increase the survival rate of breast cancer patients.
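
The abstract names the two building blocks of GridPCA (GridSearch and Principal Component Analysis) and the four classifiers, but not how they are wired together. The following is a minimal sketch of one plausible arrangement using scikit-learn, not the authors' implementation. Everything specific in it is an assumption: the synthetic data standing in for a microarray matrix, the standardisation step, the candidate values for `n_components`, the 5-fold cross-validation, and the train/test split. The clinical features mentioned in the abstract are omitted because the abstract does not say how they are combined with the gene expression data.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for a microarray matrix
# (few samples, many features); not the data used in the paper.
X, y = make_classification(
    n_samples=120, n_features=500, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four classifier families named in the abstract, with default settings.
classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbour": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
}

# Assumption: the grid search tunes only the number of principal components.
param_grid = {"pca__n_components": [5, 10, 20]}

results = {}
for name, clf in classifiers.items():
    pipe = Pipeline([
        ("scale", StandardScaler()),    # assumed preprocessing step
        ("pca", PCA(random_state=0)),   # dimensionality reduction
        ("clf", clf),
    ])
    # Cross-validated search over the PCA dimensionality.
    search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = (
        search.best_params_["pca__n_components"],
        search.score(X_test, y_test),   # accuracy of the refit best pipeline
    )

for name, (n_components, accuracy) in results.items():
    print(f"{name}: n_components={n_components}, test accuracy={accuracy:.2f}")
```

Placing PCA inside the pipeline means the components are re-estimated on each training fold, so the cross-validated scores are not influenced by the held-out samples. The accuracies printed here come from synthetic data and say nothing about the 82% figure reported in the abstract.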

Original language: English
Title of host publication: 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2020)
Editors: Taesung Park, Young-Rae Cho, Xiaohua Hu, Illhoi Yoo, Hyun Goo Woo, Jianxin Wang, Julio Facelli, Seungyoon Nam, Mingon Kang
Place of Publication: Piscataway NJ USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Number of pages: 5
ISBN (Electronic): 9781728162157
ISBN (Print): 9781728162164
Publication status: Published - 2020
Event: IEEE International Conference on Bioinformatics and Biomedicine 2020 - Virtual, Seoul, Korea, South
Duration: 16 Dec 2020 to 19 Dec 2020 (Proceedings)


Conference: IEEE International Conference on Bioinformatics and Biomedicine 2020
Abbreviated title: BIBM 2020
Country/Territory: Korea, South
City: Virtual, Seoul


  • breast cancer
  • clinical data
  • microarray
  • prediction
