HAIRpred: Prediction of human antibody interacting residues in an antigen from its primary structure

SAHNI, RUCHIR; Raghava, Gajendra P. S.; Kumar, Nishant

dc.contributor.author	SAHNI, RUCHIR	en_US
dc.contributor.author	Kumar, Nishant	en_US
dc.contributor.author	Raghava, Gajendra P. S.	en_US
dc.date.accessioned	2025-07-25T05:22:59Z
dc.date.available	2025-07-25T05:22:59Z
dc.date.issued	2025-08	en_US
dc.identifier.citation	Protein Science, 34(08).	en_US
dc.identifier.issn	0961-8368	en_US
dc.identifier.issn	1469-896X	en_US
dc.identifier.uri	https://doi.org/10.1002/pro.70212	en_US
dc.identifier.uri	http://dr.iiserpune.ac.in:8080/xmlui/handle/123456789/10318
dc.description.abstract	Several methods have been developed for predicting conformational B-cell epitopes in antigens, but they are not specific to any host. Our primary analysis of antibody–antigen complexes indicated a need for host-specific B-cell epitope prediction. In this study, we present a novel approach to predict conformational B-cell epitopes specific to human hosts by focusing on human antibody interacting residues in antigens. We trained, tested, and evaluated our models on 277 human antibody–antigen complexes. Initially, we employed machine learning models based on the one-hot encoded sequence profile of antigens, achieving a maximum area under the receiver operating characteristic curve (AUROC) of 0.61. Performance improved significantly, with the AUROC increasing from 0.61 to 0.67, when evolutionary profiles were used instead of the one-hot encoded profile. Models developed using embeddings from fine-tuned protein language models reached an AUROC of 0.61. Additionally, models utilizing predicted relative solvent accessibility achieved an AUROC of 0.67. Our ensemble model, which combined relative solvent accessibility with evolutionary profiles, achieved the best performance, with an AUROC of 0.72. All models in this study were trained using fivefold cross-validation on a training dataset and evaluated on an independent dataset not used for training or validation. Our method outperforms existing approaches on the independent dataset. Furthermore, we used the SHAP eXplainable AI (XAI) method to interpret the importance of individual feature elements contributing to the predictions made by our models. To support the scientific community, we have developed a standalone software package and a web server, HAIRpred, for predicting human antibody interacting residues in proteins.	en_US
dc.language.iso	en	en_US
dc.publisher	Wiley	en_US
dc.subject	Antibody-antigen interaction	en_US
dc.subject	Antibody interacting residues	en_US
dc.subject	B-cell epitopes	en_US
dc.subject	Machine learning	en_US
dc.subject	Protein language models	en_US
dc.subject	2025-JUL-WEEK4	en_US
dc.subject	TOC-JUL-2025	en_US
dc.subject	2025	en_US
dc.title	HAIRpred: Prediction of human antibody interacting residues in an antigen from its primary structure	en_US
dc.type	Article	en_US
dc.contributor.department	Dept. of Data Science	en_US
dc.identifier.sourcetitle	Protein Science	en_US
dc.publication.originofpublisher	Foreign	en_US
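
The abstract mentions residue-level models built on a one-hot encoded sequence profile and evaluated by AUROC. As a rough illustration only, the sketch below shows one common way such features and such a score can be computed. It is not the HAIRpred implementation: the window size of 9, the zero padding at the sequence ends, and the function names `one_hot_windows` and `auroc` are all assumptions introduced here and do not come from this record.

```python
# Illustrative sketch only; not the HAIRpred code.
# Window size and padding scheme are assumptions, not taken from the record.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_windows(sequence, window=9):
    """Return one feature vector per residue: the one-hot encoding of a
    window centred on that residue, zero-padded at the sequence ends."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    features = []
    for center in range(len(sequence)):
        vector = []
        for pos in range(center - half, center + half + 1):
            slot = [0] * len(AMINO_ACIDS)
            if 0 <= pos < len(sequence):
                idx = AA_INDEX.get(sequence[pos].upper())
                if idx is not None:  # unknown residues stay all-zero
                    slot[idx] = 1
            vector.extend(slot)
        features.append(vector)
    return features


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive residue
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    feats = one_hot_windows("ACDEFGHIK", window=9)
    print(len(feats), len(feats[0]))  # residues, features per residue
    print(auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

Each residue becomes a vector of length window × 20, which any standard classifier could take as input; the pairwise AUROC shown here is quadratic in the number of residues and is meant for clarity rather than speed.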