Tuesday, January 21, 2025


Protein Expression 101: The Rules You Can’t Ignore


 

Introduction

Protein size influences how difficult a protein is to express. Other factors also matter, such as where the protein is located in the cell, whether it is a transmembrane protein, and how soluble it is. So let's take a look at some of the rules and considerations.


Outline:

1. Protein Size Considerations for Expression

2. Practical Guidelines for Protein Size and System Choice

3. Examples of Proteins by Size Category




Small Proteins (<20 kDa)

In general, small proteins, say under 20 kilodaltons (kDa), are easier to express in E. coli. This is due to their simple folding requirements and reduced likelihood of aggregation. However, they can require fusion tags for stability and for detection during purification.
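To see which size range a construct falls into before any expression work, its mass can be roughly estimated from the number of amino acids. The sketch below assumes the common rule of thumb of about 110 Da per residue; the function name and the constant are illustrative, and the real mass depends on the actual sequence and any modifications.

```python
# Rough mass estimate from sequence length.
# Assumption: ~110 Da per amino acid residue (a rule of thumb, not an exact value).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Return an approximate protein mass in kDa for a given residue count."""
    if n_residues <= 0:
        raise ValueError("n_residues must be a positive integer")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # Residue counts: mature insulin (A + B chains) has 51, GFP has 238.
    for name, length in [("insulin", 51), ("GFP", 238)]:
        print(f"{name}: ~{estimate_mass_kda(length):.1f} kDa")
```

For insulin and GFP this gives roughly 5.6 and 26.2 kDa, close to the 5.8 and 27 kDa usually quoted for them.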


Medium-Sized Proteins (20 - 70 kDa)

The sweet spot for protein expression tends to be between 20 and 70 kDa. Proteins in this range are often suitable for expression in most systems, which generally include E. coli, yeast, and insect cells. Many structural studies use proteins in this size range because they tend to be soluble and have manageable folding requirements.


Large Proteins (>70 kDa)

Proteins larger than 70 kDa tend to run into folding problems: misfolding, aggregation, and inclusion bodies are all common. For that reason, proteins above 70 kDa are often poor candidates for expression in E. coli. However, you can co-express them with molecular chaperones, which assist folding. You may also improve the yield of soluble product by optimizing the codons. Codon optimization involves replacing certain codons in the gene with synonymous codons that the host uses more often, so that the sequence matches the codon usage of your host organism. Host systems like yeast or mammalian cells may be better suited to proteins above 70 kDa, because these hosts can handle complex folding and post-translational modifications.
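The codon replacement idea can be sketched in a few lines of Python. This is a minimal illustration, not a production optimizer: the `PREFERRED_CODON` table is a tiny placeholder covering only two amino acids, and a real workflow would use a full codon usage table for the chosen host and usually weigh other factors as well. The function names are made up for this example.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: illustrative "preferred" codons for the host, two amino acids only.
PREFERRED_CODON = {"L": "CTG", "R": "CGT"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def optimize_codons(dna: str, preferred=PREFERRED_CODON) -> str:
    """Swap each codon for the host-preferred synonymous codon, if one is listed."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    optimized = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        # Only synonymous replacements are allowed.
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        optimized.append(new_codon)
    return "".join(optimized)


if __name__ == "__main__":
    original = "ATGAGACTATAA"  # Met-Arg-Leu-stop
    optimized = optimize_codons(original)
    print(original, "->", optimized)
    # The protein sequence must be unchanged.
    assert translate(original) == translate(optimized)
```

The final check is the important part: the DNA changes, but the translated protein must stay identical.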


Very Large Proteins (>150 kDa)


Very large proteins are those greater than 150 kDa. These are generally not suitable for E. coli systems, once again because of folding and solubility issues. In such cases, insect or mammalian expression systems are your preferred option, for example the baculovirus expression vector system (BEVS) or Chinese hamster ovary (CHO) cells.


Membrane Proteins

Membrane proteins of any size tend to cause problems because of their hydrophobic domains, which need to sit in the lipid membrane, away from the surrounding water. Several tricks are usually employed, including special detergents, co-expression with accessory proteins, and host-specific adaptations. Another strategy is to express only a soluble fragment of the protein and study that portion.
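The size rules above can be collected into a small lookup. The sketch below simply encodes the thresholds and host suggestions from this post as a first-pass guide; the function names and the exact handling of the boundaries (for example, treating exactly 150 kDa as "large") are assumptions made for illustration, and real projects will have exceptions.

```python
# First-pass host suggestions based on the size rules in this post.
# Assumption: boundary handling (20, 70, 150 kDa) is a choice made for this sketch.
HOSTS_BY_CATEGORY = {
    "small": ["E. coli"],
    "medium": ["E. coli", "yeast", "insect cells"],
    "large": ["yeast", "mammalian cells"],
    "very large": ["insect cells (BEVS)", "mammalian cells (CHO)"],
}


def size_category(mass_kda: float) -> str:
    """Map a mass in kDa to one of the four size categories."""
    if mass_kda <= 0:
        raise ValueError("mass_kda must be positive")
    if mass_kda < 20:
        return "small"
    if mass_kda <= 70:
        return "medium"
    if mass_kda <= 150:
        return "large"
    return "very large"


def suggest_hosts(mass_kda: float, is_membrane: bool = False):
    """Return (category, suggested hosts, notes) for a protein."""
    category = size_category(mass_kda)
    hosts = list(HOSTS_BY_CATEGORY[category])
    notes = []
    if category == "small":
        notes.append("consider a fusion tag for stability and detection")
    if category == "large":
        notes.append("in E. coli, try chaperone co-expression and codon optimization")
    if is_membrane:
        notes.append("membrane protein: plan for detergents, accessory proteins, "
                     "or expressing a soluble fragment")
    return category, hosts, notes


if __name__ == "__main__":
    for name, mass, membrane in [("GFP", 27, False), ("Aquaporin", 30, True),
                                 ("Myosin", 210, False)]:
        print(name, suggest_hosts(mass, membrane))
```

Treat the output as a starting point for planning, not a verdict on what will or will not express.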


Examples of Proteins by Size Category

| Protein Size | Example Protein | Function | Comments |
|---|---|---|---|
| Small Proteins (<20 kDa) | Insulin (5.8 kDa) | Hormone regulating blood glucose | Small, easy to express; often used in pharmaceutical production with fusion tags for stability. |
| | Lysozyme (14.3 kDa) | Enzyme breaking down bacterial cell walls | Commonly expressed in E. coli for structural and functional studies. |
| | Cytochrome c (12 kDa) | Electron transport in mitochondria | Requires careful folding; often expressed in yeast or bacterial systems. |
| Medium-Sized Proteins (20–70 kDa) | Green Fluorescent Protein (GFP, 27 kDa) | Fluorescent marker for imaging | Highly soluble, widely expressed in E. coli. |
| | Lactalbumin (37 kDa) | Regulates lactose production | Soluble in bacterial and mammalian systems; requires minimal folding assistance. |
| | p53 (53 kDa) | Tumor suppressor protein | Typically expressed in mammalian systems to preserve post-translational modifications. |
| Large Proteins (70–150 kDa) | β-Galactosidase (116 kDa) | Enzyme breaking down lactose | Expressed in E. coli; solubility issues mitigated with optimized codon usage. |
| | IgG Antibody (150 kDa) | Immune response mediator | Requires mammalian systems (e.g., CHO cells) for glycosylation and correct assembly. |
| | Tubulin (110 kDa) | Cytoskeletal component | Expressed in insect or mammalian cells for proper folding and dimerization. |
| Very Large Proteins (>150 kDa) | RNA Polymerase II (500 kDa) | Transcription of DNA to RNA | Expressed as subunits in E. coli or insect cells, then assembled in vitro. |
| | Myosin (200–220 kDa) | Molecular motor involved in muscle contraction | Typically requires mammalian cells to retain native folding and activity. |
| | mTOR Complex (>1,000 kDa) | Key regulator of cell growth and metabolism | Expressed in mammalian cells as part of multiprotein complexes. |
| Membrane Proteins (Any Size) | GPCR (40–130 kDa) | Signal transduction | Often requires mammalian or insect cells with detergents for solubilization and stabilization. |
| | ATP Synthase (F1 subunit, 100 kDa) | Energy production in mitochondria | Large complex; subunits expressed individually in E. coli or yeast. |
| | Aquaporin (30 kDa) | Water channel in cell membranes | Expressed in E. coli with detergents or fusion proteins for stability. |




About Me

Adwoa Biotech Tools and Techniques Hub offers clear, practical explanations of essential molecular biology and biotechnology methods. Learn PCR primer design, cDNA synthesis, cloning strategies, nucleic acid purification, CRISPR delivery innovations, data analysis concepts, and everyday lab skills. Enjoyed the tutorial? Connect with me on YouTube for video content on these topics: @adwoabiotech