Encoded in DNA, a protein can be represented as a string of hundreds of individual molecules called amino acids, linked together.
Depending on the specific combination of amino acids, a protein folds in a specific way, resulting in a functional three-dimensional shape. That shape determines what the protein does, and with 20 different amino acids available, the number of possible combinations is practically endless.
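To put a number on "endless", here is a minimal sketch in Python that treats a protein as a string over the 20 standard one-letter amino acid codes and counts how many distinct chains of a given length exist. The function names and the chosen lengths are illustrative assumptions, not part of the study described here, and the count covers sequences only: it says nothing about which of them fold into a working protein.

```python
# The 20 standard amino acids, written in their usual one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_possible_sequences(length: int) -> int:
    """Number of distinct chains of `length` residues (20 choices per position)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return len(AMINO_ACIDS) ** length


def is_valid_protein(sequence: str) -> bool:
    """True if the string is non-empty and uses only the 20 standard codes."""
    return len(sequence) > 0 and all(
        residue in AMINO_ACIDS for residue in sequence.upper()
    )


if __name__ == "__main__":
    # Illustrative lengths; real proteins are typically hundreds of residues long.
    for n in (2, 10, 100, 300):
        total = count_possible_sequences(n)
        # Report the order of magnitude, since the exact number is unreadably long.
        print(f"length {n:>3}: about 10^{len(str(total)) - 1} possible sequences")
```

Because each position is chosen independently, the count grows as 20 raised to the chain length, so even a short chain of 10 amino acids already allows more than ten trillion different sequences.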
Current genomic technologies make it easy to know the amino acid sequence of a protein, but knowing its three-dimensional shape requires expensive and time-consuming experimental procedures, and these are not always successful.
For decades, researchers have tried to understand what makes a protein fold into a certain shape, so that the shape can be predicted from the amino acid sequence alone.
AlphaFold 2
AlphaFold 2 is a neural network developed by DeepMind, an artificial intelligence (AI) company owned by Google, trained specifically to predict the three-dimensional structure of proteins from their amino acid sequence. Its accuracy impressed the scientific community a few years ago after its win in the biennial international CASP protein structure modeling competition, and again when its team released predicted structures for the complete proteomes of 11 different species, including humans.
To assess all the data released by AlphaFold 2 (more than 300,000 models and growing), a community of independent researchers, including Eduard Porta, head of the Cancer Immunogenetics group at the Josep Carreras Leukemia Research Institute, compared the newly released structures with those already available and concluded that AlphaFold 2 contributed an additional 25% of high-quality protein structures in any given species. The analysis was published in the journal Nature Structural & Molecular Biology.
The role that many proteins play in diseases such as cancer is already known, but a lack of detailed knowledge of how they work at the molecular level hinders the development of specific strategies against them. Structural information will help scientists understand these proteins much better, identify the other molecules they might interact with inside the cell, and design new drugs that can interfere with their function when they are altered.
Limitations
AlphaFold 2 has its limits. The community team found that the algorithm has problems when trying to recreate protein complexes. Most proteins work together with other proteins to perform a biological function, so predicting how different proteins fit together would be highly desirable. Another limitation is its inability to show how the structure changes in mutant proteins, which have altered amino acids in their sequence. Mutations often lead to abnormal protein function and are the cause of many diseases, such as cancer.
Despite the limitations, the team recognizes the contribution of AlphaFold 2 to the community, which will greatly influence basic and biomedical research in the years to come. That influence comes not only from its direct contribution (thousands of new reliable 3D protein models), but also from ushering in a new era of AI-based computational tools capable of yielding results that no one can predict.
Meta enters the scene
Recently, a team at Meta (formerly Facebook) used a modified version of a natural language predictor to “autocomplete” proteins. The AI tool, called ESMFold, appears to be less accurate than its Google counterpart, but it is 60 times faster and can overcome some of the identified limitations of AlphaFold 2, such as handling mutated sequences.
Overall, as the authors of the publication conclude, “the application of AlphaFold2 [and the coming tools] will have a transformative impact on the life sciences.”