Abstract:
|
Novel protein design is an important task in pharmaceutical research and development. Recent technological advances have enabled efficient protein design by mimicking the natural evolutionary steps of mutation, selection, and amplification in a laboratory environment. However, because the number of possible polypeptide and amino acid sequences is astronomically large, it is difficult to explore functionally interesting variants, and an exhaustive search over all sequence combinations remains impossible. In this work, we developed machine learning and deep learning methods for predicting protein properties from their sequences. Our prediction models successfully identify promising mutations of protein sequences in prospective, time-split training and test data sets. These results indicate that the approach can potentially speed up the protein design process significantly.
|