Abstract:
Predicting the tertiary structure of peptides containing non-canonical amino acids (NCAAs) remains a major hurdle in computational biology, even though such peptides are a rapidly growing class of therapeutics. Existing methods for peptide structure prediction are largely limited to natural amino acids, while state-of-the-art all-atom models capable of handling NCAAs, such as AlphaFold 3 and Boltz-1, require substantial GPU infrastructure that is inaccessible to most research groups. Classical approaches such as PEPstrMOD rely on force fields but are limited to NCAAs with pre-existing force field parameters, leaving the majority of chemical modification space poorly covered. To overcome these limitations, we developed Alpha-Mod, a hybrid, computationally efficient pipeline for tertiary structure prediction of peptides containing NCAAs. Alpha-Mod employs a divide-and-conquer strategy: AlphaFold 2 predicts the backbone structure, while ET-Flow generates a three-dimensional conformer of each NCAA in isolation from its SMILES representation. The backbone and the NCAA conformers are merged by anchor-atom superimposition using the Kabsch algorithm and refined with the MACE-OFF23 machine learning force field to eliminate steric clashes without residue-specific parameters. Alpha-Mod was benchmarked on three datasets: ModPep 257 (n=257), ModPep 16 (n=16), and a newly curated dataset, PEP_SOLO (n=23). On ModPep 257, Alpha-Mod achieved a mean Cα RMSD of 3.65 Å, outperforming AlphaFold 3 (4.14 Å) and PEPstrMOD (4.07 Å). Secondary structure recovery on ModPep 257 yielded a Q3 accuracy of 95.43% for Alpha-Mod versus 74.05% for AlphaFold 3. On ModPep 16, Alpha-Mod achieved a mean Cα RMSD of 2.40 Å compared to 4.35 Å for PEPstrMOD, a substantial improvement on structured modified peptides.
These results demonstrate that modular integration of bioinformatic and cheminformatic tools can achieve competitive or superior structural accuracy relative to all-atom deep learning models for a large and therapeutically relevant subset of the modified peptide space, while operating at a fraction of the computational cost and without requiring model retraining or residue-specific force field parameters. Alpha-Mod is available on GitHub, and a Colab notebook is provided to make it more accessible to the research community.
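
For readers unfamiliar with the superimposition step named above, the following is a minimal NumPy sketch of the Kabsch algorithm together with the RMSD used as the accuracy metric. It is an illustration under stated assumptions, not the Alpha-Mod implementation: the function names `kabsch` and `rmsd` are hypothetical, and the abstract does not state which atoms serve as anchors, so the example uses arbitrary coordinates.

```python
import numpy as np


def kabsch(mobile, target):
    """Return (R, t) such that mobile @ R.T + t best fits target (least squares).

    mobile, target: (N, 3) arrays of matched anchor-atom coordinates.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)

    # Center both point sets, then build the 3x3 covariance matrix.
    P = mobile - mobile_centroid
    Q = target - target_centroid
    H = P.T @ Q

    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1),
    # not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = target_centroid - R @ mobile_centroid
    return R, t


def rmsd(a, b):
    """Root-mean-square deviation between two matched (N, 3) coordinate sets."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


# Illustrative data only: four arbitrary "anchor" points and a copy of them
# that has been rotated about the z axis and shifted.
anchors = np.array([[0.0, 0.0, 0.0],
                    [1.5, 0.0, 0.0],
                    [2.0, 1.4, 0.0],
                    [3.3, 1.6, 0.9]])
theta = np.deg2rad(40.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
moved = anchors @ rot_z.T + np.array([5.0, -2.0, 1.0])

R, t = kabsch(anchors, moved)
fitted = anchors @ R.T + t
print("RMSD after superimposition:", rmsd(fitted, moved))
```

In a pipeline of this kind, the same `R` and `t` fitted on the anchor atoms would be applied to every atom of the independently generated conformer, so that it is placed on the predicted backbone before refinement.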