Coarse-Grained Strategy for Modeling Protein Stability in Concentrated Solutions
We present a coarse-grained approach for modeling the thermodynamic stability of single-domain globular proteins in concentrated aqueous solutions. Our treatment derives effective protein-protein interactions from basic structural and energetic characteristics of the native and denatured states. These characteristics, along with the intrinsic (i.e., infinite dilution) thermodynamics of folding, are calculated from elementary sequence information using a heteropolymer collapse theory. We integrate this information into Reactive Canonical Monte Carlo simulations to investigate the connections between protein sequence hydrophobicity, protein-protein interactions, protein concentration, and the thermodynamic stability of the native state. The model predicts that sequence hydrophobicity can affect how protein concentration impacts native-state stability in solution. In particular, low-hydrophobicity proteins are primarily stabilized by increases in protein concentration, whereas high-hydrophobicity proteins exhibit richer nonmonotonic behavior. These trends appear qualitatively consistent with the available experimental data. Although factors such as pH, salt concentration, and protein charge are also important for protein stability, our analysis suggests that some of the nontrivial experimental trends may be driven by a competition between destabilizing hydrophobic protein-protein attractions and entropic crowding effects.
Proteins in their native states play an important role in many biological processes and pharmaceutical applications. They participate in almost every aspect of the biochemical transport and regulation required for living organisms. They also serve as therapeutic drugs for targeting infectious diseases and cancer. However, most proteins, under most conditions, exhibit only marginal thermodynamic stability. As a result, minor sequence mutations or even small perturbations to solution parameters (e.g., pH, temperature, or concentration) can result in protein denaturation (1).
This has important practical consequences. Unfolded or misfolded proteins look and behave differently than they do in their native states (2). They lack the highly specific molecular structure necessary for biological activity. Moreover, since unfolded proteins generally expose a significant number of hydrophobic core residues to the aqueous solvent, they have a tendency to associate and form non-native aggregates in solution. Unwanted protein aggregation and subsequent precipitation pose enormous problems in biological and pharmaceutical contexts (3,4). These processes are connected, although in a manner still imperfectly understood, to a number of debilitating diseases such as Parkinson's, Alzheimer's, Huntington's, and Down's syndrome (5-9). They are also known to cause rapid degradation of pharmaceutical formulations, reducing the shelf life of promising new drugs and restricting the strategies available for purification, handling, and delivery of therapeutics (10-15).
Given the technological and practical importance of protein stability, there is an urgent need to develop a generic understanding of the thermodynamic driving forces for protein unfolding, aggregation, crystallization, and phase separation. One way to facilitate this understanding is to build models that can account for, at various levels of sophistication, three vital aspects of stability for the native and denatured states: the relationship between protein sequence and structure, protein-protein interactions, and the global phase behavior of protein solutions. This is a formidable challenge because proteins are inherently large and complex molecules. Moreover, proteins encounter a wide variety of solution environments in biological and pharmaceutical processes, exposing them to thermal, mechanical, osmotic, and chemical stresses. To describe protein behavior under these conditions, one must also have a reliable method for accounting for hydrophobic interactions (16-23), which are a dominant force (24,25) in biomolecular folding and assembly events. These interactions have been particularly challenging to model because they exhibit subtle dependencies on both the state of the solution and the size and shape of the participating solutes.
Although each of the aforementioned aspects of protein stability has long been appreciated, theoretical investigations have focused more on studying them independently than on devising strategies for integrating them into a single model. For example, models developed to investigate the single-molecule protein folding problem (26-30) are almost invariably too complicated to be extended, either theoretically or via computer simulation, to investigate the collective behavior of thousands of proteins and millions of water molecules in solution. On the other hand, many recent theoretical models introduced to study the thermodynamics of crystallization or liquid-liquid phase separation in protein solutions (31-40), although insightful in many respects, do not consider sequence information, the polymeric character of the individual proteins, or the possibility of protein unfolding. Despite the development of some powerful coarse-grained models that address some of these issues (41-60), theoretical methods that can provide a comprehensive understanding of the protein stability problem are still lacking.