eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Advancing inverse folding models: exploring diverse optimization techniques

No data is associated with this publication.
Abstract

Inverse folding is an important task in protein engineering. Once a significant challenge, the problem is now addressed by many effective deep learning models, exemplified by ProteinMPNN. However, some deficiencies remain. First, although the relationship between sequence and structure is many-to-many, most existing models are trained solely to predict the reference sequence of the original structure. Second, the sequence recovery rate has long been the primary, and often the only, metric used to evaluate inverse folding models. This work therefore explores two aspects. First, we designed and compared strategies that improve the performance of inverse folding models by using a modular approach to incorporate information beyond the reference sequence during training. Second, we introduced a comprehensive series of assessments of model performance, which allows a detailed comparison of the models' strengths and weaknesses and focuses on their practical applications.

Main Content

This item is under embargo until July 5, 2026.