5. DNA: The Substance of the Genes

The Human and Chimpanzee Genomes

Now that the genomes of both the human and the chimpanzee have been sequenced, it is possible to make more direct comparisons between the two species.

The results:

  • Their genomes are 98.8% identical (between any two humans — picked at random — the figure is closer to 99.5%).
  • Comparing over 7,000 genes that occur in both species (as well as in the mouse), it turns out that slightly over 1,500 of these have evolved quite differently in the two species.
    • In humans, genes for hearing, speech, olfaction — among others — have evolved rapidly since the two species diverged, while
    • in chimps, genes involved in formation of muscle and skeleton have evolved more rapidly.
  • In addition to the differences in the proteins these genes encode, differences between chimps and humans also involve changes in the regulatory sequences (promoters and enhancers) of their genes. This is especially evident for genes encoding transcription factors. So even if a gene product is quite similar in the two species, how strongly, where, and when its gene is expressed might be quite different.
  • One gene product that is different between the two species is a protein designated myosin heavy chain 16 (MYH16). In the 25 March 2004 issue of Nature, Stedman et al. report that
    • in chimpanzees (as well as pygmy chimps, gorillas, orangutans, and Old World monkeys), the MYH16 gene is expressed almost exclusively in their jaw muscles, where it is transcribed and translated to produce one form of myosin that is used in the thick filaments of their jaw muscle fibers.
    • In all humans, the MYH16 gene has a deletion producing a frameshift that results in a premature STOP codon (a nonsense mutation), and thus the myosin molecule is too short to make effective thick filaments.
    • The result in humans is jaw muscles that must rely on other myosins. Their fibers are much thinner than those of the other hominoids, and the muscle as a whole is much smaller.
    • From these data, the authors speculate that
      • because smaller — and thus weaker — jaw muscles would exert less force on their bony attachment (the cranium),
      • this would allow more flexibility and the potential for greater growth of the cranium during childhood and thus allow for the larger brain found in humans and their Homo ancestors (e.g. Homo erectus).
    • The authors date the origin of the mutation to approximately 2.5 million years ago, when fossil evidence indicates that the line leading to the genus Homo (with brain cases growing from 0.75 to 1.4 liters) split from the Australopithecus line, with its brain case of less than half a liter.
  • While there are only small (~1%) coding differences in their genes, their genomes differ in other ways.
    • Some DNA sequences are found in one species but not the other. Later work has revealed that of 510 chimpanzee sequences that are deleted in the human genome, only one occurs in the coding region of a gene. The others are found in introns or between genes, and at least some of these occur in gene-regulatory regions like enhancers.
    • Many single-nucleotide differences create different splicing sites, so alternative splicing can produce substantial differences in the proteins of the two species.
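
The MYH16 finding described above (a deletion that shifts the reading frame and brings a STOP codon into frame too early) can be illustrated with a short Python sketch. The DNA sequence here is an invented toy example, not the real MYH16 sequence, and the function name `translate` is only illustrative. The one thing taken as given is the standard genetic code.

```python
# Toy illustration of a frameshift deletion creating a premature STOP codon.
# The DNA sequence is invented for this example; it is NOT the MYH16 gene.

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_dna):
    """Translate a coding strand (5'->3') until the first STOP codon."""
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        amino_acid = CODON_TABLE[coding_dna[start:start + 3]]
        if amino_acid == "*":  # STOP codon: translation ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical "normal" gene: ATG GAA CTG ACC TAC GGT AAA TAA
normal = "ATGGAACTGACCTACGGTAAATAA"

# Delete a single base (the G at index 3); every later codon is now misread.
mutant = normal[:3] + normal[4:]

print("normal protein:", translate(normal))
print("mutant protein:", translate(mutant))
```

In this toy case the deletion turns the third codon into TGA, a STOP codon, so the mutant protein is cut short after two amino acids.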

13 June 2011

Pyrosequencing

In laboratories around the world there is an intense desire to sequence more genomes:

  • those of a wide variety of organisms to aid in establishing evolutionary relationships;
  • those of pooled populations of microorganisms in, for example, sea water, soil, the large intestine;
  • other humans to look for
    • genes that predispose to disease;
    • genetic patterns in various ethnic groups.

All of the sequenced genomes listed in Genome Sizes were determined using the dideoxy method invented by Frederick Sanger and described in the page DNA Sequencing.

But now a great effort is being expended to find ways to sequence DNA more rapidly (and more cheaply).

Several new methods are being developed and one is already commercially available (the Genome Sequencer 20 System). Its method is called pyrosequencing or sequencing by synthesis.

It works like this.

  • The DNA to be sequenced is broken up into fragments of ~100 base pairs and denatured to form single-stranded DNA (ssDNA).
  • Single ssDNA fragments are attached to microscopic beads, which are separated from each other.
  • The polymerase chain reaction (PCR) is run on each bead so that each becomes coated with ~10 million identical copies of that fragment.
  • The beads are placed singly into separate, microscopic wells (~200,000 of them).
  • Each well receives a cocktail of reagents:
    • DNA polymerase — for adding deoxyribonucleotides to the ssDNA
    • adenosine phosphosulfate (APS)
    • ATP sulfurylase — an enzyme that forms ATP from adenosine phosphosulfate (APS) and pyrophosphate (PPi).
    • luciferin
    • luciferase — an ATPase that catalyzes the conversion of luciferin to oxyluciferin with the liberation of light.

After a primer is annealed to the end of the ssDNA, synthesis is ready to begin.

As is always true of DNA synthesis, incoming nucleotides are added to the 3′ end of the growing chain.

The nucleotides are supplied as four deoxynucleoside triphosphates. As each nucleotide is added, a molecule containing two phosphate groups, called pyrophosphate (PPi), is split off.

The sequencing run:

  • Each of the thousands of wells is flooded with one of the four deoxyribonucleotides:
    • dTTP, dCTP, or dGTP, or
    • deoxyadenosine alpha-thiotriphosphate (dATPαS) in place of dATP (which would itself trigger the luciferin reaction). DNA polymerase ignores the difference and uses it whenever a T is encountered on the ssDNA template, but luciferase doesn’t recognize it.
  • In any well where the supplied nucleotide is complementary to the next unpaired base on the template, the nucleotide is added to the 3′ end of the growing strand and pyrophosphate is liberated. ATP sulfurylase uses the pyrophosphate and APS to form ATP, and luciferase then uses that ATP to convert luciferin to oxyluciferin, emitting light.
  • The amount of light is proportional to the number of that nucleotide added. So if, for example, the incoming nucleotide is dGTP, and there is a string of 3 Cs on the template, the light emitted will be 3 times brighter than if only one C is present.
  • A detector picks up the light (if any) from each well and the data are recorded.
  • Then each of the remaining 3 nucleotides is added in turn.
  • Then the sequence of 4 additions is repeated until synthesis is complete.
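
The sequencing run in a single well can be sketched as a small simulation. This is only an idealized model under stated assumptions: the flow order T, A, C, G is an assumption (the text does not give the order), the "A" flow stands for dATPαS, the light signal is taken to be exactly the number of nucleotides added, and the function name `simulate_flowgram` is hypothetical. The template is written in the order the polymerase reads it (3′ to 5′).

```python
# Idealized simulation of the light signals from one pyrosequencing well.
# Assumptions (not from the text): flow order T, A, C, G; noise-free signal.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def simulate_flowgram(template_3to5, flow_order="TACG"):
    """Return a list of (nucleotide flowed, number of additions) pairs."""
    if set(template_3to5) - set(COMPLEMENT):
        raise ValueError("template may contain only A, C, G, T")

    flows = []
    pos = 0  # next unpaired template base
    while pos < len(template_3to5):
        for nucleotide in flow_order:
            added = 0
            # Polymerase keeps adding while the next template base pairs
            # with the nucleotide currently in the well (homopolymer runs).
            while (pos < len(template_3to5)
                   and COMPLEMENT[template_3to5[pos]] == nucleotide):
                added += 1
                pos += 1
            flows.append((nucleotide, added))  # light is proportional to 'added'
            if pos >= len(template_3to5):
                break  # synthesis is complete
    return flows


# A run of 3 Cs on the template gives a signal of 3 when dGTP is flowed.
for nucleotide, signal in simulate_flowgram("ACCCGT"):
    print(nucleotide, signal)
```

Flows that produce no light are recorded as 0; they are as informative as the peaks, since they show that the next template base is not complementary to that nucleotide.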

The diagram on the left shows the type of data produced in a single well. The height of the peak of light production gives the number of additions that occurred when a particular nucleotide was added (bottom).

Computer software then displays the template sequence (top) for each of the thousands of different fragments sequenced.
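
What such software has to do can be sketched in a few lines. This is a simplified illustration, not the instrument's actual base-calling method: it assumes a repeating flow order of T, A, C, G, rounds each light intensity to the nearest whole number of additions, and uses invented intensity values. The function names are hypothetical.

```python
# Simplified decoding of one well's light signals into a template sequence.
# Assumptions (not from the text): flow order T, A, C, G; simple rounding;
# the intensity values below are invented for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def decode_flowgram(intensities, flow_order="TACG"):
    """Convert per-flow light intensities into the newly synthesized strand."""
    synthesized = []
    for i, signal in enumerate(intensities):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, round(signal))  # peak height -> number of additions
        synthesized.append(nucleotide * count)
    return "".join(synthesized)


def template_from_synthesized(synthesized):
    """Reverse-complement the new strand to get the template, 5'->3'."""
    return synthesized.translate(COMPLEMENT)[::-1]


signals = [1.05, 0.02, 0.0, 2.9, 0.1, 0.0, 0.97, 0.03, 0.0, 1.1]
new_strand = decode_flowgram(signals)
print("synthesized strand (5'->3'):", new_strand)
print("template strand    (5'->3'):", template_from_synthesized(new_strand))
```

Rounding works well for short runs, but telling apart, say, 7 from 8 identical bases in a row depends on small relative differences in peak height, which is one reason long homopolymer stretches are harder to read with this approach.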

With this technology, as many as 20 million base pairs of genome sequence can be learned in an instrument run of less than 6 hours.
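
The 20 million figure is consistent with the numbers given earlier, as a quick check shows. The calculation below idealizes by assuming every one of the ~200,000 wells yields a full read of ~100 bases; the variable names are just for illustration.

```python
# Back-of-the-envelope check of the quoted throughput.
# Idealization: every well gives a full-length read.

wells = 200_000        # ~200,000 wells per run (from the text)
read_length_bp = 100   # fragments of ~100 base pairs (from the text)
run_hours = 6          # a run takes less than 6 hours (from the text)

total_bp = wells * read_length_bp
bp_per_hour = total_bp / run_hours  # lower bound, since runs take < 6 hours

print(f"{total_bp:,} bp per run")
print(f"at least {bp_per_hour:,.0f} bp per hour")
```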

 



7 March 2011
