Wednesday, June 24, 2026

What Is Next-Generation Sequencing? NGS vs Sanger Explained


The Human Genome Project generated the first reference sequence of the human genome, representing approximately 3.1–3.2 billion base pairs. The project began in 1990, produced a draft genome in 2001, and by 2003 had generated a high-quality reference covering approximately 92% of the genome. Most of the remaining gaps were highly repetitive regions that were difficult to sequence. In 2022, the first gap-free human genome assembly was completed, marking the culmination of more than three decades of work.

Today, thanks to next-generation sequencing, or NGS, an entire human genome can be sequenced in about a day, although the complete laboratory workflow—including sample preparation, sequencing, and data analysis—typically takes longer. This dramatic improvement in speed is possible because NGS sequences millions to billions of DNA fragments simultaneously. In contrast, Sanger sequencing analyzes individual DNA fragments in separate reactions, making it far less suitable for large-scale genome sequencing.

Many NGS workflows align sequencing reads to an existing reference genome, such as the human reference genome produced through the Human Genome Project. However, NGS can also be used for de novo genome assembly when no reference genome exists. The basic principle behind NGS is that DNA is fragmented into many short pieces, each of which is sequenced independently. The resulting sequences, known as reads, are then computationally assembled or aligned to reconstruct the original DNA sequence. NGS can be used to sequence both DNA and RNA.
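The fragment-then-align idea can be sketched in a few lines of Python. This is a toy illustration only: the reference string, the read length, and the helper names (`fragment`, `align_reads`, `reconstruct`) are made up for the demonstration, and exact string matching stands in for the error-tolerant algorithms that real aligners use.

```python
def fragment(sequence, read_length, step):
    """Cut a sequence into overlapping reads of a fixed length (toy model)."""
    return [sequence[i:i + read_length]
            for i in range(0, len(sequence) - read_length + 1, step)]


def align_reads(reads, reference):
    """Place each read on the reference by exact matching (hypothetical helper).

    Returns (position, read) pairs; reads that do not match are skipped.
    """
    placements = []
    for read in reads:
        position = reference.find(read)
        if position != -1:
            placements.append((position, read))
    return placements


def reconstruct(placements, length):
    """Rebuild the sequence from placed reads; uncovered positions become 'N'."""
    rebuilt = ["N"] * length
    for position, read in placements:
        for offset, base in enumerate(read):
            rebuilt[position + offset] = base
    return "".join(rebuilt)


reference = "ATGCGTACGTTAGCCTAGGA"   # made-up 20-base reference
reads = fragment(reference, read_length=8, step=4)
placements = align_reads(reads, reference)
print(reads)
print(reconstruct(placements, len(reference)))
```

Because the reads overlap and together span every position, the rebuilt string matches the reference. Any position that no read covered would show up as `N`.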

The workflow begins with sample collection, followed by purification of DNA or RNA. The nucleic acids are then assessed to ensure they are of sufficient quality and are not degraded. For most RNA sequencing applications, RNA is first converted into complementary DNA, or cDNA, through reverse transcription before library preparation.

A sequencing library is then prepared from the DNA or cDNA. A library consists of many short DNA fragments derived from a longer DNA molecule. These fragments are generated by mechanically shearing the DNA using high-frequency sound waves or by enzymatic fragmentation.

Specialized DNA sequences called adapters are then ligated to both ends of each DNA fragment. These adapters contain the sequences required for binding to the flow cell, sequencing primer binding sites, sample indices or barcodes for multiplexing, and, in some workflows, unique molecular identifiers. After ligation, excess unbound adapters are removed, completing library preparation.

Depending on the application, the library may undergo PCR amplification to increase the amount of DNA available for sequencing. Many modern whole-genome sequencing workflows are PCR-free to minimize amplification bias. Before sequencing, the library is assessed to confirm that the fragment size distribution and DNA concentration meet the instrument's requirements.

One of the most widely used NGS technologies is Illumina sequencing, which uses a method known as sequencing by synthesis.

Sequencing takes place on a glass flow cell coated with millions of short DNA oligonucleotides. These oligonucleotides are complementary to the adapter sequences attached to the library fragments.

First, the double-stranded library is denatured to produce single-stranded DNA molecules. These strands bind to complementary oligonucleotides on the flow cell surface. DNA polymerase then synthesizes the complementary strand, after which the original template strand is removed, leaving a single DNA strand attached to the flow cell.

At this stage, the fluorescent signal from a single DNA molecule would be too weak for reliable detection. Therefore, each bound DNA fragment undergoes bridge amplification, also known as solid-phase PCR, to generate a cluster of genetically identical DNA molecules.

During bridge amplification, the attached DNA strand bends over and hybridizes with a nearby oligonucleotide on the flow cell, forming a bridge. DNA polymerase synthesizes the complementary strand, creating a double-stranded bridge. The strands are then denatured, and the process repeats multiple times. This produces a localized cluster containing thousands of identical DNA copies. One strand of each cluster is then removed, leaving single-stranded templates ready for sequencing.

A sequencing primer is added, followed by DNA polymerase and four fluorescently labeled reversible terminator nucleotides corresponding to A, T, G, and C. During each sequencing cycle, only one nucleotide is incorporated because the reversible terminator temporarily blocks further DNA synthesis.

After nucleotide incorporation, high-resolution imaging captures the fluorescent signal emitted by each cluster, identifying the base that was added. The fluorescent label and blocking group are then chemically removed, allowing the next sequencing cycle to begin. This process is repeated for the programmed read length.
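The one-base-per-cycle logic can be mimicked with a short loop. This is only a conceptual sketch of a single cluster: the template sequence is invented, the function name `sequence_by_synthesis` is hypothetical, and the imaging and cleavage chemistry are reduced to comments.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_by_synthesis(template_3_to_5, cycles):
    """Toy model of one cluster: one base is incorporated and imaged per cycle.

    The template is given in the order the polymerase reads it (3' to 5'),
    so the returned read is the new strand written 5' to 3'.
    """
    read = []
    for cycle in range(min(cycles, len(template_3_to_5))):
        template_base = template_3_to_5[cycle]
        incorporated = COMPLEMENT[template_base]  # terminator blocks any further extension
        read.append(incorporated)                 # imaging: the dye colour identifies the base
        # cleavage: dye and blocking group are removed so the next cycle can start
    return "".join(read)


print(sequence_by_synthesis("TACGGATC", cycles=5))  # ATGCC
```

The `cycles` argument plays the role of the programmed read length: the loop stops after that many incorporations even if more template remains.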

After completion of the first sequencing read, index reads are generated to identify the sample from which each DNA fragment originated. In paired-end sequencing, additional chemistry regenerates the complementary strand so that the opposite end of the original DNA fragment can also be sequenced. Unique dual indexing increases multiplexing capacity and limits the impact of index hopping, because reads carrying an unexpected index combination can be identified and discarded. The maximum number of samples that can be pooled depends on the indexing kit and sequencing platform.

Once sequencing is complete, image analysis software converts the fluorescent signals into DNA sequences. Low-quality reads are filtered out. On patterned flow cells, overlapping clusters are largely eliminated, although polyclonal clusters—where more than one library fragment occupies the same nanowell—may still occur and are also removed during quality filtering.

The remaining high-quality reads are then demultiplexed using their index sequences to assign each read to its original sample.
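In code, demultiplexing boils down to a lookup from index sequence to sample name. The sketch below assumes exact index matches and uses invented index sequences and sample names. Real demultiplexing software typically tolerates a small number of mismatches, which this example does not attempt.

```python
from collections import defaultdict


def demultiplex(reads, sample_sheet):
    """Group reads by index sequence; unknown indices go to 'undetermined'."""
    by_sample = defaultdict(list)
    for index, sequence in reads:
        sample = sample_sheet.get(index, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)


# Made-up sample sheet and (index, read) pairs for illustration
sample_sheet = {"ACGTAC": "sample_1", "TGCATG": "sample_2"}
reads = [
    ("ACGTAC", "GGCTA"),
    ("TGCATG", "TTAGC"),
    ("ACGTAC", "CCGAT"),
    ("GGGGGG", "ATATA"),  # index not in the sample sheet
]

for sample, sequences in demultiplex(reads, sample_sheet).items():
    print(sample, sequences)
```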

Depending on the application, the reads are either aligned to a reference genome or assembled de novo. During alignment, overlapping reads reconstruct the original DNA sequence. In paired-end sequencing, the software recognizes that the two reads originate from opposite ends of the same DNA fragment, improving alignment accuracy, particularly across repetitive or structurally complex genomic regions.

An important sequencing metric is read depth, also known as sequencing depth, which refers to the number of sequencing reads covering a particular nucleotide position. Average read depth describes the average coverage across the target region. Approximately 30× average depth is considered standard for whole-genome sequencing, while targeted oncology assays may use average depths of around 1,500× to detect rare somatic mutations.
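A rough feel for these numbers comes from the idealised relationship: average depth = number of reads × read length ÷ genome size. The sketch below uses the 3.1 billion base pair genome size mentioned earlier and assumes 150-base reads. That read length is my assumption for illustration, and the formula ignores duplicates, unmapped reads, and uneven coverage.

```python
import math


def average_depth(num_reads, read_length, genome_size):
    """Idealised average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_depth, read_length, genome_size):
    """Number of reads required to reach a target average depth (rounded up)."""
    return math.ceil(target_depth * genome_size / read_length)


GENOME_SIZE = 3_100_000_000   # approximate human genome size in base pairs
READ_LENGTH = 150             # assumed read length in bases

print(reads_needed(30, READ_LENGTH, GENOME_SIZE))   # 620000000
```

In a paired-end run each of the two reads counts separately here, so 620 million reads corresponds to 310 million read pairs.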

Another key metric is coverage, or breadth of coverage, which refers to the proportion of the target genome or genomic region represented by sequencing reads. High coverage ensures that few or no regions are missed during sequencing.
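Depth and breadth are easy to confuse, so here is a small sketch that computes both from the same made-up alignments. The `(start, length)` pairs and the 20-base region are invented for the example.

```python
def depth_per_position(alignments, region_length):
    """Count how many reads cover each position of the region."""
    depth = [0] * region_length
    for start, read_length in alignments:
        for position in range(start, min(start + read_length, region_length)):
            depth[position] += 1
    return depth


def breadth_of_coverage(depth, min_depth=1):
    """Fraction of positions covered by at least min_depth reads."""
    covered = sum(1 for d in depth if d >= min_depth)
    return covered / len(depth)


# Made-up alignments as (start position, read length) over a 20-base region
alignments = [(0, 5), (3, 5), (4, 5), (12, 5)]
depth = depth_per_position(alignments, region_length=20)

print(depth)
print("average depth:", sum(depth) / len(depth))       # 1.0
print("breadth at 1x:", breadth_of_coverage(depth))    # 0.7
```

The average depth is 1×, yet only 70% of the region is covered at all, because the reads pile up in one place and leave gaps elsewhere. This is why both metrics are reported.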

NGS has transformed both research and clinical practice. It is widely used for diagnosing rare genetic disorders, identifying inherited and somatic variants, guiding cancer treatment, monitoring infectious diseases, studying microbial communities, and supporting research across medicine, ecology, agriculture, and evolutionary biology.

Both DNA and RNA can be sequenced using NGS. Applications include whole-genome sequencing, whole-exome sequencing, targeted gene panels, transcriptome sequencing, and sequencing of coding and non-coding RNAs such as microRNAs and long non-coding RNAs. Specialized sequencing methods also enable the analysis of cell-free DNA, single cells, DNA methylation, chromatin accessibility, and protein–DNA interactions through techniques such as ChIP-seq.



Related post:

Sanger Sequencing Explained: How One Missing Oxygen Changed DNA Sequencing Forever

https://adwoabiotech.blogspot.com/2026/06/sanger-sequencing-explained-how-one.html


Sanger Sequencing Explained: How One Missing Oxygen Changed DNA Sequencing Forever



In 1977 Frederick Sanger described a method of DNA sequencing using chain-terminating nucleotides called dideoxynucleotides. 

The aim was to determine the sequence of nucleotides in a piece of DNA using these artificial or synthetic nucleotide analogues. Unlike natural DNA nucleotides (deoxyribonucleoside triphosphates, dNTPs), these lab-synthesised forms lack the 3'-hydroxyl group required for DNA strand extension.

This method became known as Sanger sequencing.

These chain-terminating nucleotides are called dideoxyribonucleoside triphosphates (ddNTPs).

DNA is a chain built from four different nucleotides, which are supplied to the polymerase as dNTPs. To copy DNA and extend a DNA strand, DNA polymerase adds the nucleotide complementary to the template, one at a time.

A closer look at its structure shows that a dNTP consists of a deoxyribose sugar, a nitrogenous base, and a triphosphate group. A nucleoside consists of a sugar and a base. In DNA the sugar is deoxyribose, while in RNA it is ribose. The base is one of four bases: adenine, thymine, guanine, or cytosine.

A ddNTP lacks both the 2'-OH and 3'-OH groups found in ribose. Compared with deoxyribose, it is missing the 3'-OH group.

The role of DNA polymerase is to add new nucleotides to a growing DNA strand. During DNA synthesis, the 3'-hydroxyl group (3'-OH) of the growing DNA strand reacts with the α-phosphate of the incoming dNTP, forming a phosphodiester bond and releasing pyrophosphate.

If a ddNTP is incorporated into the DNA strand, synthesis stops because the ddNTP lacks the 3'-OH group required to add the next nucleotide. This absence of a 3'-OH group terminates DNA chain elongation.

It is also useful to understand the naming convention of 5' (five-prime) and 3' (three-prime). The carbons in the deoxyribose sugar are numbered 1' through 5'. The nitrogenous base is attached to the 1' carbon, while the phosphate group is attached to the 5' carbon.

The 3'-OH group attached to the 3' carbon is the chemical group required for DNA strand extension, because DNA polymerase joins it to the 5' phosphate group of the incoming dNTP. This is why DNA synthesis always proceeds in the 5'→3' direction. By convention, DNA sequences are also written from 5' to 3'.

DNA polymerase adds nucleotides complementary to the template strand, so C pairs with G and A pairs with T.
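The pairing rule and the 5'→3' writing convention combine into the familiar reverse-complement operation. A minimal sketch, with a made-up six-base template:

```python
# Translation table implementing the pairing rules A-T and G-C
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Base-by-base complement, keeping the same left-to-right order."""
    return strand.translate(COMPLEMENT)


def reverse_complement(strand_5_to_3):
    """Complementary strand, rewritten so that it also reads 5' to 3'."""
    return complement(strand_5_to_3)[::-1]


template = "ATGGCA"                   # made-up template, written 5' to 3'
print(complement(template))           # TACCGT (this string reads 3' to 5')
print(reverse_complement(template))   # TGCCAT (same strand, written 5' to 3')
```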

So how does Sanger sequencing work?

The original Sanger sequencing method was different from the one used today. The original method was completely manual and used radioactive labels.

Let's take a look at the original Sanger sequencing method.

We need a primer, DNA polymerase, dNTPs, a DNA template, and ddNTPs.

One of the dNTPs, usually dATP, is labeled with a radioactive isotope.

A total of four tubes are used, one for each ddNTP.

To begin, the DNA template is heated to denature the double-stranded DNA into single strands. Remember, this was before PCR existed. Because the DNA polymerases available at the time were not thermostable, the enzyme was added after the denaturation step.

The mixture is then cooled to allow the sequencing primer to anneal to the template.

DNA polymerase, all four dNTPs, and one of the four ddNTPs are then added to each tube.

DNA polymerase extends the primer along the DNA template. Occasionally, a ddNTP is incorporated instead of a dNTP, terminating the DNA fragment.

Because the ddNTP is present at a much lower concentration than the corresponding dNTP, it is incorporated only occasionally and at random positions.

The result is a collection of DNA fragments that terminate at every occurrence of that particular base, generating fragments of different lengths.

All fragments in a tube begin with the same primer sequence and end with the same terminating nucleotide.
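To make the four-tube logic concrete, the sketch below lists which fragment lengths would appear in each tube for a short, invented newly synthesised strand. Lengths are counted in bases added after the primer, and the function name `termination_lengths` is hypothetical.

```python
def termination_lengths(new_strand):
    """For each ddNTP tube, list the fragment lengths that can form.

    A fragment of length n appears in the ddX tube whenever base n of the
    newly synthesised strand is X (lengths exclude the primer).
    """
    tubes = {"A": [], "C": [], "G": [], "T": []}
    for position, base in enumerate(new_strand, start=1):
        tubes[base].append(position)
    return tubes


new_strand = "TGCATGGA"   # made-up newly synthesised strand, 5' to 3'
for dd_base, lengths in termination_lengths(new_strand).items():
    print(f"dd{dd_base}TP tube: fragment lengths {lengths}")
```

Every length from 1 to 8 appears in exactly one tube, which is what makes it possible to read the sequence back from the gel.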

Low incorporation of the ddNTP allows longer stretches of DNA to be sequenced.
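A simple probability model shows why. Suppose that at every position where the tube's base is required, the chain terminates with probability p. The fraction of chains that extend past k such positions is then (1 − p)^k. This is my own simplification for illustration: it assumes each site is independent, and the values of p and k below are arbitrary rather than taken from any protocol.

```python
def fraction_extending_past(sites, termination_probability):
    """Fraction of chains still growing after passing `sites` matching positions."""
    return (1 - termination_probability) ** sites


# 50 matching sites is roughly a 200-base stretch if the four bases are equally common
for p in (0.01, 0.1):
    surviving = fraction_extending_past(50, p)
    print(f"termination probability {p}: {surviving:.4f} of chains pass 50 sites")
```

With a 1% chance of termination per site, about 60% of chains are still growing after 50 sites. With a 10% chance, about 0.5% remain, so long fragments become too rare to detect.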

In the original Sanger method, read lengths of approximately 200 nucleotides were achievable.

Next, the sequencing reactions are mixed with loading dye and loaded into separate lanes of a polyacrylamide gel.

The fragments migrate through the gel according to size, with smaller fragments moving faster than larger fragments.

Polyacrylamide gels have sufficient resolution to distinguish DNA fragments that differ by a single nucleotide in length.

At this stage the fragments cannot be seen.

The loading dye indicates when the fragments have migrated through the gel.

To visualise the DNA fragments, the gel is dried onto a support and exposed to X-ray film.

The radioactive labels incorporated into the DNA fragments expose the film, producing a pattern of bands.

The process of determining the DNA sequence from these bands is called base calling.

The gel is read from the bottom upward, starting with the shortest fragment. This reveals the sequence of the newly synthesized DNA strand in the 5'→3' direction.

For example, if the shortest fragment appears in the ddTTP lane, the first base called is T. If the next shortest fragment appears in the ddGTP lane, the next base is G.

Continuing upward through the gel reveals the complete sequence.
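Reading a gel from the bottom up is really just sorting bands by fragment length. Here is a minimal sketch using invented band positions, where each lane holds the fragment lengths seen in that ddNTP reaction. The function name `call_bases` is hypothetical.

```python
def call_bases(lanes):
    """Read the gel from shortest to longest fragment and return the sequence.

    `lanes` maps each terminating base to the fragment lengths in its lane.
    """
    bands = [(length, base) for base, lengths in lanes.items() for length in lengths]
    return "".join(base for length, base in sorted(bands))


# Made-up gel: shortest band in the T lane, next shortest in the G lane, and so on
lanes = {"A": [4, 8], "C": [3], "G": [2, 6, 7], "T": [1, 5]}
print(call_bases(lanes))   # TGCATGGA
```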

The original Sanger sequencing method was very labor-intensive. It could take several days to generate approximately 200 nucleotides of sequence from only a small number of samples.

There was a strong need to streamline and automate the process.

Applied Biosystems created the first commercial automated DNA sequencing instrument in 1987, the AB370A.

Researchers had already demonstrated that fluorescent dyes could replace radioactive labels. These fluorescent dyes were safer and eliminated the need for time-consuming X-ray film detection.

In this instrument, each of the four sequencing reactions was labeled with a different fluorescent dye.

After the sequencing reactions were completed, all four reactions could be combined and loaded into a single lane of a gel.

The AB370A used a laser to detect fluorescent DNA fragments as they migrated through the gel.

The instrument automatically transferred the data to a computer, which performed automated base calling.

Up to 16 samples could be run simultaneously, with read lengths approaching 450 nucleotides.

The AB370A demonstrated that DNA sequencing could be faster and more automated.

Scientists began to think that sequencing the entire human genome might be achievable.

In 1990 the U.S. government launched the Human Genome Project, an international effort to map and sequence the entire human genome.

Sequencing the human genome promised major advances in biology and medicine, including identifying genes associated with inherited diseases and improving our understanding of human biology.

Kary Mullis had invented PCR in 1983, but it was not until 1989 that Vincent Murray applied thermostable Taq polymerase to Sanger sequencing.

In traditional Sanger sequencing, most labeled primers remain unused because the primer is present in excess relative to the DNA template.

The use of Taq polymerase allowed repeated cycles of denaturation, primer annealing, and extension, similar to PCR.

Because only a single sequencing primer is present, newly synthesised strands do not serve as templates for exponential amplification.

As a result, the amount of sequencing product increases approximately linearly rather than exponentially. This process became known as cycle sequencing.
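The difference between linear and exponential growth is easy to see with idealised numbers. The sketch below assumes perfect efficiency in every cycle, which real reactions never achieve, and counts newly made strands per starting template.

```python
def cycle_sequencing_products(template_copies, cycles):
    """One primer: each template yields one new strand per cycle (linear growth)."""
    return template_copies * cycles


def pcr_products(template_copies, cycles):
    """Two primers: the number of copies doubles every cycle (exponential growth)."""
    return template_copies * 2 ** cycles


cycles = 25
print("cycle sequencing:", cycle_sequencing_products(1, cycles))   # 25
print("PCR:", pcr_products(1, cycles))                             # 33554432
```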

The increased signal generated by cycle sequencing also reduced the amount of input DNA required.

Another important advance was capillary electrophoresis.

In capillary electrophoresis, DNA fragments migrate through a thin capillary filled with a polymer matrix under an electric field.

The narrow capillary efficiently dissipates heat, allowing higher voltages to be used without overheating.

Higher voltages result in faster separations and improved resolution.

Beckman Instruments (now Beckman Coulter) launched the first commercial capillary electrophoresis instrument in 1989.

This technology paved the way for capillary-based Sanger sequencing systems.

Applied Biosystems launched the ABI Prism 310 in 1995, marking the beginning of modern Sanger sequencing.

The ABI Prism 310 replaced slab gels with a single capillary.

A sequencing run could be completed in under three hours, with read lengths approaching 600 base pairs.

The system also automated sample loading and reduced DNA input requirements through electrokinetic injection.

DNA fragments were separated by size, detected by a laser, and analyzed automatically by software that performed base calling.

Although fluorescently labeled ddNTPs were available, fluorescent primer labeling was initially preferred because it produced more uniform signal intensities.

This changed with the introduction of BigDye Terminator chemistry in 1997.

BigDye Terminators improved the balance of fluorescent signal intensity among dye-labeled ddNTPs, allowing all four termination reactions to be performed in a single tube.

This greatly simplified sequencing workflows.

The Human Genome Project continued to drive demand for faster and more automated sequencing technologies.

In 1998 Applied Biosystems launched the ABI Prism 3700, which contained 96 capillaries.

The ABI Prism 3700 played a major role in sequencing the human genome.

Each run processed 96 samples simultaneously, generated read lengths approaching 800 base pairs, and required minimal hands-on time.

Celera Genomics, led by Craig Venter, purchased hundreds of ABI Prism 3700 instruments and used them to compete directly with the publicly funded Human Genome Project.

Celera produced a draft human genome sequence in 2001, and the Human Genome Project published its own draft sequence the same year.

As an aside, the Human Genome Project did not sequence the DNA of a single individual. The reference genome was assembled from DNA obtained from multiple anonymous donors. Much of this DNA came from blood samples: red blood cells lack nuclei and therefore contain no genomic DNA, but white blood cells contain nuclei and provide a rich source of genomic DNA. The resulting reference genome was therefore a composite sequence rather than the genome of any one person.


Modern Sanger sequencing remains widely used today.

Sanger sequencing typically achieves greater than 99.9% accuracy in high-quality reads, which is why it is often considered the benchmark against which other sequencing technologies are compared.
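Base-calling accuracy is commonly expressed as a Phred quality score, Q = −10 × log10(error probability), so 99.9% accuracy corresponds to about Q30. The post itself does not use Phred scores, so treat this as a side calculation rather than part of the Sanger workflow.

```python
import math


def phred_quality(p_error):
    """Phred quality score for a given per-base error probability."""
    return -10 * math.log10(p_error)


def error_probability(quality):
    """Per-base error probability for a given Phred quality score."""
    return 10 ** (-quality / 10)


for p_error in (0.01, 0.001, 0.0001):
    accuracy = (1 - p_error) * 100
    print(f"accuracy {accuracy:.2f}% is about Q{phred_quality(p_error):.0f}")
```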

For small projects involving individual genes, plasmids, or a limited number of samples, Sanger sequencing is often faster and more cost-effective than next-generation sequencing (NGS).

However, Sanger sequencing has lower throughput and lower sensitivity for detecting rare variants, typically requiring variants to be present at roughly 15–20% frequency before they can be reliably detected.

In contrast, NGS can detect variants present at much lower frequencies and can generate billions of reads and terabases of sequence data in a single run.

This allows many whole human genomes to be sequenced simultaneously.

For validating variants, sequencing plasmids, or analysing a small number of genes or samples, Sanger sequencing remains one of the most practical and widely used sequencing methods available.


Friday, June 19, 2026


Complete RPMI-1640 Preparation: A Practical Laboratory Workflow

A Step-by-Step Guide to Plasmodium Culture Media

If you’ve spent any time in a malaria lab, you know that Plasmodium falciparum is a picky eater. 

The goal here is to create a stable, nutrient-rich environment that mimics human physiological conditions while accounting for the common pitfalls of lab life, such as the frustrating tendency of buffers to outgas or of essential amino acids to degrade.

The medium used (RPMI-1640) was developed at Roswell Park Memorial Institute in the 1960s. The "1640" refers to the formulation number assigned during its development, reflecting the extensive empirical (trial-and-error, experiment-based) optimisation that produced the medium.

RPMI-1640 has a fascinating history because it emerged during the period when mammalian cell culture was transitioning from empirical media recipes to more rationally designed formulations. For those who, like me, wondered what "empirical" means: it describes approaches based on observation, experimentation, and trial and error rather than on a complete theoretical understanding.

Origin of RPMI-1640

RPMI stands for Roswell Park Memorial Institute.

The medium was developed at what is now the Roswell Park Comprehensive Cancer Center in Buffalo, New York.

The principal developers, working during the 1960s, were:

  • George E. Moore

  • Robert E. Gerner

  • Harold A. Franklin


The general components and amounts for 1 L of complete RPMI are:

RPMI 1640 powder: 10.44 g (the nutritional backbone, containing L-glutamine at a final concentration of 2 mM). Use 5.22 g if making only 500 mL.

HEPES: 5.96 g, which gives 25 mM (your primary buffering agent). If making 500 mL from a 1 M HEPES stock, add 12.5 mL. Some powdered RPMI formulations already include HEPES, so check the product sheet before adding more.

NaHCO3: 58 mL of a 3.6% solution (about 2 g/L final; for pH stability and gas exchange). Sodium bicarbonate is notorious for outgassing (releasing CO2), which can cause your pH to drift upward over time. To combat this, we add it just before use.

Hypoxanthine: 50 mg (about 0.37 mM in 1 L; essential for parasite purine salvage)

Gentamicin: 20 µg/mL final concentration (your antibiotic shield). For 1 L, add 20 mg of gentamicin.

Ultrapure/sterile H2O: 960 mL

1 M NaOH: for dissolving the hypoxanthine

Concentrated HCl and NaOH: for final pH adjustment
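If you want to double-check the numbers or scale the recipe, the arithmetic is simple enough to script. The helper functions below are my own illustration, not part of any protocol. The only outside value is the molar mass of HEPES, about 238.3 g/mol.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Solve C1 x V1 = C2 x V2 for the stock volume V1 (same units for both concentrations)."""
    return final_conc * final_volume_ml / stock_conc


def scale_amount(amount_per_litre, final_volume_ml):
    """Scale a per-litre amount to a different final volume."""
    return amount_per_litre * final_volume_ml / 1000


def millimolar(mass_g, molar_mass_g_per_mol, volume_l):
    """Concentration in mM from a mass dissolved in a given volume."""
    return mass_g / molar_mass_g_per_mol / volume_l * 1000


HEPES_MOLAR_MASS = 238.3   # g/mol

print(stock_volume_ml(25, 1000, 500))                      # 25 mM from a 1 M stock in 500 mL: 12.5 mL
print(round(scale_amount(10.44, 500), 2))                  # RPMI powder for 500 mL: 5.22 g
print(round(millimolar(5.96, HEPES_MOLAR_MASS, 1.0), 1))   # 5.96 g HEPES in 1 L: 25.0 mM
```

The same `scale_amount` call works for any other component in the list when you make a batch smaller or larger than 1 L.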








About Me

Adwoa Biotech Tools and Techniques Hub offers clear, practical explanations of essential molecular biology and biotechnology methods. Learn PCR primer design, cDNA synthesis, cloning strategies, nucleic acid purification, CRISPR delivery innovations, data analysis concepts, and everyday lab skills. Enjoyed the tutorial? Connect with me on YouTube for video content on these topics: @adwoabiotech