Multi-Label Protein Subcellular Localization Using Graph Attention and Self-Recalibrated Feature Representations

Authors

  • Baig Ayesha, University of Gujrat
  • Jameel Usama, Tianjin University
  • Syed Hassan Lal Gilani, University of Gujrat
  • Maheen Asif, University of Gujrat

DOI:

https://doi.org/10.59934/jaiea.v5i1.1710

Keywords:

Multi-label protein subcellular localization, Deep learning, Graph Attention Network (GAT), Self-attention-based feature recalibration (SAFR), Feature-Generative Adversarial Network (F-GAN), Canonical Correlation Analysis (CCA)

Abstract

Accurately predicting protein subcellular localization is essential for understanding biological function and informing medical research. To address the limitations of traditional laboratory techniques, this study introduces two deep learning frameworks, ML-FGAT and ML-GRat, for multi-label protein subcellular localization (ML-PSL). ML-FGAT integrates seven feature encoding schemes (DC, PsePSSM, CTD, GO, CT, DDE, and EBGW) [5], followed by Differential Evolution (DE)-based feature fusion and entropy-guided feature selection. To improve representation quality, a self-attention-based feature recalibration (SAFR) module emphasizes biologically relevant features. A Feature-Generative Adversarial Network (F-GAN) then balances class distributions, and classification is performed using a Graph Attention Network (GAT). ML-FGAT achieved OAA scores ranging from 93.5% to 98.8% across five test datasets. ML-GRat uses DE for feature weighting and Canonical Correlation Analysis (CCA) for dimensionality reduction, followed by SAFR and a hybrid GAT-ResNet classifier. This model achieved OAA scores between 94.0% and 98.9% on six independent datasets, including SARS-CoV-2 and human proteins [6]. The proposed models demonstrate robust generalization, high predictive performance, and improved interpretability for ML-PSL tasks in computational biology.
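
As an illustration of two ingredients named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes that DC denotes dipeptide composition (the frequency of each adjacent residue pair) and that OAA denotes overall actual accuracy, read here as the fraction of proteins whose predicted label set exactly matches the true set. The function names `dipeptide_composition` and `overall_actual_accuracy` are hypothetical, as is the choice to skip pairs containing non-standard residues.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order that defines the vector layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of adjacent-pair frequencies.

    Assumption: pairs containing a non-standard residue (e.g. 'X') are
    skipped, and frequencies are normalized by the number of counted pairs.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or without any valid pair: return all zeros.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def overall_actual_accuracy(true_label_sets, predicted_label_sets):
    """Fraction of samples whose predicted label set equals the true set.

    Assumption: OAA is the exact-match criterion, so a partially correct
    prediction for a multi-label protein counts as a miss.
    """
    if len(true_label_sets) != len(predicted_label_sets):
        raise ValueError("true and predicted lists must have equal length")
    if not true_label_sets:
        raise ValueError("at least one sample is required")
    exact = sum(
        set(t) == set(p) for t, p in zip(true_label_sets, predicted_label_sets)
    )
    return exact / len(true_label_sets)
```

For example, `dipeptide_composition("ACAC")` counts the pairs AC, CA, and AC, so the entry for AC is 2/3 and the entry for CA is 1/3. Under the exact-match reading, a protein annotated with both nucleus and cytoplasm but predicted only as nucleus contributes nothing to OAA, which is why this metric is stricter than per-label accuracy.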

References

P. Bai, G. Li, J. Luo, and C. Liang, “Deep learning model for protein multi-label subcellular localization and function prediction based on multi-task collaborative training,” Brief. Bioinform., vol. 25, no. 6, bbae568, Nov. 2024.

Liang and Y. Qiu, “Predicting protein subcellular localization with multi-label deep learning,” Bioinformatics, vol. 38, no. 21, pp. 4941–4949, Nov. 2022.

K. L., S. Gao, S. Yao, F. Wu, and J. Li, “Gm-PLoc: A subcellular localization model of multi-label protein based on GAN and DeepFM,” Front. Genet., vol. 13, 912614, Aug.

H. Kobayashi, K. C. Cheveralls, M. D. Leonetti, and L. Royer, “Self-supervised deep learning of protein subcellular localization with cytoself,” Nat. Methods, vol. 19, pp. 995–1003, Jul. 2022.

Y. Pärnamaa and L. Parts, “Accurate classification of protein subcellular localization from high-throughput microscopy images using deep learning,” G3 Genes Genomes Genet., vol. 7, no. 5, pp. 1385–1392, May 2017.

Liu and Y. Wang, “Multi-marginal contrastive learning for multi-label subcellular protein localization,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 12345–12354, Jun. 2022.

S. Wan and Q. Zou, “HPSLPred: An ensemble multi-label classifier for human protein subcellular location prediction with imbalanced source,” arXiv preprint, arXiv:1704.05204, Apr. 2017.

X. Zhang, Y. Tseo, Y. Bai, F. Chen, and C. Uhler, “Prediction of protein subcellular localization in single cells,” Nat. Biotechnol., vol. 42, pp. 123–130, Jul. 2024.

A. Razdaibiedina and A. Brechalov, “Learning multi-scale functional representations of proteins from single-cell microscopy data,” arXiv preprint, arXiv:2205.11676, May 2022.

V. Thumuluri, J. J. A. Armenteros, A. R. Johansen, H. Nielsen, and O. Winther, “DeepLoc 2.0: multi-label subcellular localization prediction using protein language models,” Nucleic Acids Res., vol. 50, no. W1, pp. W228–W234, Jul. 2022.

L. Wu, S. Gao, S. Yao, F. Wu, and J. Li, “Gm-PLoc: A subcellular localization model of multi-label protein based on GAN and DeepFM,” Front. Genet., vol. 13, 912614, Jun. 2022.

M. Zeng, Y. Wu, Y. Liu, R. Yin, C. Zheng, J. Liu, and Y. Huang, “LncLocFormer: a Transformer-based deep learning model for multi-label lncRNA subcellular localization prediction by using localization-specific attention mechanism,” Brief. Bioinform., vol. 24, no. 1, bbac560, Jan. 2023.

M. Zeng, Y. Wu, Y. Li, R. Yin, C. Lu, J. Duan, and M. Li, “LncLocFormer: a Transformer-based deep learning model for multi-label lncRNA subcellular localization prediction by using localization-specific attention mechanism,” Bioinformatics, vol. 39, no. 12, btad752, Dec. 2023.

Published

2025-10-15

How to Cite

Baig Ayesha, Jameel Usama, Syed Hassan Lal Gilani, & Maheen Asif. (2025). Multi-Label Protein Subcellular Localization Using Graph Attention and Self-Recalibrated Feature Representations. Journal of Artificial Intelligence and Engineering Applications (JAIEA), 5(1), 1737–1744. https://doi.org/10.59934/jaiea.v5i1.1710