eBlocks Gene Fragments can be synthesized for a variety of protein design applications
eBlocks Gene Fragments are uniquely suited for high-throughput screening of multiple constructs in protein design research. In the study discussed here, Wicky et al. used eBlocks Gene Fragments encoding hallucinated proteins to identify new protein structures. Researchers can clone IDT Gene Fragments into vectors using strategies such as seamless cloning.
An overview of protein design research
Protein design research focuses on editing naturally occurring peptides or designing oligomers de novo for applications in therapeutics, industry, agriculture, and sustainability. These proteins play important roles in biology, from molecular binding to enzymatic catalysis and other protein functions. Protein oligomers that assemble from several identical subunits, also referred to as homo-oligomers, can also be used to study protein structure because of their size and wide range of uses. Researchers study these subunits to understand the overall structure of the protein and to identify new protein structure designs. The challenge is that researchers have needed to specify the protein structure before design and to be able to confirm that structure experimentally. This has limited protein design to what is already known, rather than allowing exploration of the greater expanse of structural possibilities.
Identifying protein structures using the ProteinMPNN sequence design method, docking and computational approaches
Protein design typically starts with a hierarchical docking approach, a set of protocols for predicting complex protein structures that begins by characterizing monomers and then moves to higher-order structures. Docking faces several challenges when in silico models are used to predict not just the subunit protomers but also the structure of a multi-subunit oligomer assembly. Further, the hierarchical docking approach can be considered insufficient because it is limited in how well it captures protein-protein interactions.
ProteinMPNN is a deep learning protein sequence design method that is applicable to protein structure research. The authors used ProteinMPNN to generate new protein sequences that might be better expressed in E. coli. One challenge they faced was overfitting of the initial designs: when synthesized and cloned into expression systems, these sequences produced virtually no proteins with appreciable soluble expression. Ordering the protein-coding DNA sequences as eBlocks Gene Fragments enabled them to rapidly screen many sequences using reliable synthetic biology techniques. Given the high accuracy of ProteinMPNN, researchers can use this approach to design novel sequences that can be experimentally validated.
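As a rough illustration of the step between a designed amino acid sequence and a DNA sequence that could be ordered as a gene fragment, the sketch below back-translates a protein using one fixed codon per residue. It is not the authors' workflow or an IDT tool: the codon choices in `CODON_FOR`, the function names, and the example peptide are assumptions made for illustration, and a real order would normally go through a dedicated codon optimization and sequence complexity check.

```python
# Hypothetical one-codon-per-residue table. Every codon is a valid standard
# genetic code assignment, but the specific choices are illustrative only.
CODON_FOR = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def back_translate(protein, add_stop=True):
    """Return a DNA coding sequence for a protein given in one-letter code."""
    protein = protein.strip().upper()
    unknown = sorted(set(protein) - set(CODON_FOR))
    if unknown:
        raise ValueError(f"Unsupported residue(s): {', '.join(unknown)}")
    dna = "".join(CODON_FOR[aa] for aa in protein)
    return dna + (STOP_CODON if add_stop else "")


def gc_fraction(dna):
    """Fraction of G and C bases, a common quick check before ordering."""
    return (dna.count("G") + dna.count("C")) / len(dna)


if __name__ == "__main__":
    design = "MKLEELLKKA"  # made-up peptide, not a sequence from the study
    cds = back_translate(design)
    print(cds, len(cds), round(gc_fraction(cds), 2))
```

Using a single codon per residue keeps the example short. In practice, varying codons helps avoid repeats and extreme GC content, which is one reason to rely on an established codon optimization tool.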
The authors used deep network hallucination to gain insight into the subunits of the protein. They hallucinated across the space of protein oligomeric structures, specifying only the chain length and oligomer valency. This computational approach interpolates and extends the native fold space of proteins instead of relying on recapitulating known protein structures. By using Monte Carlo optimization, the authors were able to identify well-defined states of new structures.
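The Monte Carlo search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_score` is a made-up placeholder standing in for the deep network's assessment of a sequence, the cooling schedule and step count are arbitrary, and oligomer valency is left out so that only chain length is specified.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWY")


def toy_score(seq):
    """Placeholder objective (NOT the study's network-based score).

    Rewards hydrophobic residues at even positions and all other residues
    at odd positions, purely so the optimizer has something to climb.
    """
    hits = sum((aa in HYDROPHOBIC) == (i % 2 == 0) for i, aa in enumerate(seq))
    return hits / len(seq)


def monte_carlo_design(length, steps=2000, t_start=0.1, t_end=0.001, seed=0):
    """Metropolis Monte Carlo with annealing over amino acid sequences."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]  # random start
    score = toy_score(seq)
    best_seq, best_score = list(seq), score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a point mutation
        new_score = toy_score(seq)
        delta = new_score - score
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            score = new_score
            if score > best_score:
                best_seq, best_score = list(seq), score
        else:
            seq[pos] = old  # reject: restore the previous residue
    return "".join(best_seq), best_score


if __name__ == "__main__":
    sequence, final_score = monte_carlo_design(length=30)
    print(sequence, final_score)
```

Occasionally accepting worse mutations early in the run lets the search escape local optima. As the temperature drops, the search settles into a well-defined state.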
Confirming and characterizing new protein structures using x-ray crystallography and cryogenic electron microscopy (cryo-EM)
There are several techniques for studying protein structures. Researchers can use x-ray crystallography to study protein structures at the atomic level. Crystallization is also used for the separation and purification of proteins. To evaluate the accuracy of their designs, the authors generated crystal structures and solved structures for 7 of 19 designs.
The authors also performed cryo-EM, an imaging technique used to analyze and confirm protein structures. Because of the small molecular weight of their designed proteins, they performed high-resolution single-particle cryo-EM characterization.
Overall, Wicky et al. combined machine learning, synthetic biology with eBlocks Gene Fragments, and various proteomics techniques to identify new protein design structures. Deep network hallucination of novel protein structures, confirmed experimentally, can deepen our knowledge of protein structure and expand the arsenal of engineered proteins available for many different applications. Learn more about how IDT Gene Fragments can help you with your protein design research.