PFresGO: an attention mechanism-based deep-learning approach for protein annotation by integrating gene ontology inter-relationships

Tong Pan, Chen Li, Yue Bi, Zhikang Wang, Robin B. Gasser, Anthony W. Purcell, Tatsuya Akutsu, Geoffrey I. Webb, Seiya Imoto, Jiangning Song

Research output: Contribution to journalArticleResearchpeer-review

3 Citations (Scopus)


MOTIVATION: The rapid accumulation of high-throughput sequence data demands the development of effective and efficient data-driven computational methods to functionally annotate proteins. However, most current approaches used for functional annotation simply focus on the use of protein-level information but ignore inter-relationships among annotations. RESULTS: Here, we established PFresGO, an attention-based deep-learning approach that incorporates hierarchical structures in Gene Ontology (GO) graphs and advances in natural language processing algorithms for the functional annotation of proteins. PFresGO employs a self-attention operation to capture the inter-relationships of GO terms, updates its embedding accordingly and uses a cross-attention operation to project protein representations and GO embedding into a common latent space to identify global protein sequence patterns and local functional residues. We demonstrate that PFresGO consistently achieves superior performance across GO categories when compared with 'state-of-the-art' methods. Importantly, we show that PFresGO can identify functionally important residues in protein sequences by assessing the distribution of attention weightings. PFresGO should serve as an effective tool for the accurate functional annotation of proteins and functional domains within proteins. AVAILABILITY AND IMPLEMENTATION: PFresGO is available for academic purposes at SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Original languageEnglish
Article numberbtad094
Number of pages8
Issue number3
Publication statusPublished - Mar 2023

Cite this