Restriction Enzymes

Restriction enzymes are DNA-cutting enzymes found in bacteria (and harvested from them for use). Because they cut within the molecule, they are often called restriction endonucleases.

To sequence DNA, it is first necessary to cut it into smaller fragments. Many DNA-digesting enzymes (like those in your pancreatic fluid) can do this, but most of them are of no use for sequence work because they cut each molecule randomly. This produces a heterogeneous collection of fragments of varying sizes. What is needed is a way to cleave the DNA molecule at a few precisely located sites so that a small set of homogeneous fragments is produced. The tools for this are the restriction endonucleases. The rarer the site that a given restriction endonuclease recognizes, the fewer pieces it produces.
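The link between the rarity of a site and the number of pieces can be estimated roughly. The sketch below assumes, purely for illustration, a random sequence in which the four bases are equally frequent, so that one specific site of n bases is expected about once every 4^n base pairs. Real genomes depart from this idealization, and the function name `expected_sites` is hypothetical. The 5,386 base pairs used in the example is the size of the φX174 genome.

```python
def expected_sites(genome_length, site_length):
    """Expected number of occurrences of one specific recognition site.

    ASSUMPTION: a random sequence with equal frequencies of A, C, G and T.
    This is an idealization, not a property of any real genome.
    """
    return genome_length / 4 ** site_length


# Compare recognition sites of 4, 6 and 8 bases in a 5,386-bp genome.
for n in (4, 6, 8):
    print(f"{n}-base site: about {expected_sites(5386, n):.2f} expected sites")
```

Under these assumptions a longer (rarer) site yields far fewer cuts, which is why enzymes with longer recognition sequences produce fewer, larger fragments.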

A restriction enzyme recognizes and cuts DNA only at a particular sequence of nucleotides. For example, the bacterium Haemophilus aegyptius produces an enzyme named HaeIII that cuts DNA wherever it encounters the sequence

5′ GGCC 3′
3′ CCGG 5′

The cut is made between the adjacent G and C. This particular sequence occurs at 11 places in the circular DNA molecule of the virus φX174. Thus treatment of this DNA with the enzyme produces 11 fragments, each with a precise length and nucleotide sequence. These fragments can be separated from one another and the sequence of each determined.
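The counting argument above (one fragment per site on a circular molecule) can be sketched in a few lines. This is a minimal illustration, not a laboratory tool: it follows only one strand, which is adequate for an enzyme like HaeIII that cuts both strands at the same position, and the function names and the 16-base toy sequence are made up for the example.

```python
def find_sites(sequence, site):
    """Start positions of every occurrence of `site` in a CIRCULAR sequence."""
    n = len(sequence)
    # Append the first len(site)-1 bases so sites spanning the origin are found.
    extended = sequence + sequence[:len(site) - 1]
    return [i for i in range(n) if extended.startswith(site, i)]


def digest_circular(sequence, site="GGCC", cut_offset=2):
    """Cut a circular molecule `cut_offset` bases into each site (GG^CC)."""
    n = len(sequence)
    cuts = sorted((pos + cut_offset) % n for pos in find_sites(sequence, site))
    if not cuts:
        return [sequence]  # no site: the circle stays uncut
    doubled = sequence + sequence  # lets a fragment wrap past the origin
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]
        length = (end - start) % n or n  # a single cut linearizes the circle
        fragments.append(doubled[start:start + length])
    return fragments


toy_circle = "ATGGCCTTAAGGCCGA"  # hypothetical sequence with two GGCC sites
for fragment in digest_circular(toy_circle):
    print(len(fragment), fragment)
```

Two sites on the toy circle give two fragments, just as the 11 sites in φX174 give 11 fragments. A linear molecule with the same number of sites would give one more fragment than it has sites.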


HaeIII and AluI cut straight across the double helix producing “blunt” ends. However, many restriction enzymes cut in an offset fashion. The ends of the cut have an overhanging piece of single-stranded DNA. These are called “sticky ends” because they are able to form base pairs with any DNA molecule that contains the complementary sticky end. Any other source of DNA treated with the same enzyme will produce such molecules.

Mixed together, these molecules can join with each other by the base pairing between their sticky ends. The union can be made permanent by another enzyme, a DNA ligase, that forms covalent bonds along the backbone of each strand. The result is a molecule of recombinant DNA (rDNA).
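The difference between blunt and sticky ends can be made concrete with a small sketch. It assumes a palindromic recognition site that is cut the same number of bases from the 5′ end on each strand. HaeIII (GG^CC) and AluI (AG^CT) are the enzymes named above; EcoRI (G^AATTC) does not appear in this text and is added here only as a familiar example of an offset cutter. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def end_type(site, cut_offset):
    """Classify the ends left by cutting a palindromic site.

    `cut_offset` is the number of bases between the 5' end of the site
    and the cut, counted the same way on both strands.
    Returns (kind, single-stranded overhang read 5' to 3').
    """
    if reverse_complement(site) != site:
        raise ValueError("this sketch only handles palindromic sites")
    half = len(site) / 2
    if cut_offset == half:
        return ("blunt", "")
    if cut_offset < half:
        # Both strands are cut left of center, leaving a 5' overhang.
        return ("5' overhang", site[cut_offset:len(site) - cut_offset])
    # Both strands are cut right of center, leaving a 3' overhang.
    return ("3' overhang", site[len(site) - cut_offset:cut_offset])


def can_pair(overhang_a, overhang_b):
    """Two sticky ends can base-pair if one is the reverse complement of the other."""
    return overhang_a != "" and reverse_complement(overhang_a) == overhang_b


print(end_type("GGCC", 2))    # HaeIII
print(end_type("AGCT", 2))    # AluI
print(end_type("GAATTC", 1))  # EcoRI (example not taken from the text)
```

Because the overhang left at a palindromic site is its own reverse complement, `can_pair` returns True for any two ends made by the same enzyme, whatever the source of the DNA. That is the property that lets fragments from different organisms be joined.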

The ability to produce recombinant DNA molecules has not only revolutionized the study of genetics, but has laid the foundation for much of the biotechnology industry. The availability of human insulin (for diabetics), human factor VIII (for males with hemophilia A), and other proteins used in human therapy all were made possible by recombinant DNA.



12 March 2011

DNA Sequencing

DNA sequencing is the determination of the precise sequence of nucleotides in a sample of DNA.

The most popular method for doing this is called the dideoxy method or Sanger method (named after its inventor, Frederick Sanger, who was awarded the 1980 Nobel Prize in chemistry [his second] for this achievement).

DNA is synthesized from four deoxynucleotide triphosphates. The top formula shows one of them: deoxythymidine triphosphate (dTTP). Each new nucleotide is added to the 3′ -OH group of the last nucleotide added.


The dideoxy method gets its name from the critical role played by synthetic nucleotides that lack the -OH at the 3′ carbon atom (red arrow). A dideoxynucleotide (dideoxythymidine triphosphate — ddTTP — is the one shown here) can be added to the growing DNA strand but when it is, chain elongation stops because there is no 3′ -OH for the next nucleotide to be attached to. For this reason, the dideoxy method is also called the chain termination method.

The bottom formula shows the structure of azidothymidine (AZT), a drug used to treat AIDS. AZT (which is also called zidovudine) is taken up by cells where it is converted into the triphosphate. The reverse transcriptase of the human immunodeficiency virus (HIV) prefers AZT triphosphate to the normal nucleotide (dTTP). Because AZT has no 3′ -OH group, DNA synthesis by reverse transcriptase halts when AZT triphosphate is incorporated in the growing DNA strand. Fortunately, the DNA polymerases of the host cell prefer dTTP, so side effects from the drug are not so severe as might have been predicted.

The Procedure

The DNA to be sequenced is prepared as a single strand.

This template DNA is supplied with

- a mixture of all four normal (deoxy) nucleotides in ample quantities:
  - dATP
  - dGTP
  - dCTP
  - dTTP
- a mixture of all four dideoxynucleotides, each present in limiting quantities and each labeled with a “tag” that fluoresces a different color:
  - ddATP
  - ddGTP
  - ddCTP
  - ddTTP
- DNA polymerase I

Because all four normal nucleotides are present, chain elongation proceeds normally until, by chance, DNA polymerase inserts a dideoxy nucleotide (shown as colored letters) instead of the normal deoxynucleotide (shown as vertical lines). If the ratio of normal nucleotide to the dideoxy versions is high enough, some DNA strands will succeed in adding several hundred nucleotides before insertion of the dideoxy version halts the process.

At the end of the incubation period, the fragments are separated by length from longest to shortest. The resolution is so good that a difference of one nucleotide is enough to separate that strand from the next shorter and next longer strand. Each of the four dideoxynucleotides fluoresces a different color when illuminated by a laser beam and an automatic scanner provides a printout of the sequence.
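The logic of the procedure (random termination, separation by length, reading the terminal label) can be mimicked with a toy simulation. This is only a sketch under simplifying assumptions: every position has the same chance of receiving a dideoxynucleotide, the primer is ignored, and the template is written 3′ to 5′ so that the new strand reads 5′ to 3′ from left to right. All names and the 8-base template are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def terminated_fragment(template_3_to_5, dd_fraction, rng):
    """Copy the template until a dideoxynucleotide happens to be inserted.

    Returns (fragment length, labeled terminal base), or None if the
    strand reached the end of the template without terminating.
    """
    for position, template_base in enumerate(template_3_to_5, start=1):
        new_base = COMPLEMENT[template_base]
        # ASSUMPTION: the same termination chance at every position.
        if rng.random() < dd_fraction:
            return position, new_base
    return None


def read_sequence(fragments):
    """Order fragments by length and read off their terminal labels."""
    label_by_length = {}
    for length, label in fragments:
        label_by_length[length] = label
    # Reading from shortest to longest gives the new strand 5' to 3'.
    return "".join(label_by_length[n] for n in sorted(label_by_length))


def simulate(template_3_to_5, strands=10000, dd_fraction=0.2, seed=0):
    """Synthesize many strands and reconstruct the sequence from them."""
    rng = random.Random(seed)
    fragments = []
    for _ in range(strands):
        fragment = terminated_fragment(template_3_to_5, dd_fraction, rng)
        if fragment is not None:
            fragments.append(fragment)
    return read_sequence(fragments)


print(simulate("TACGGTCA"))
```

Raising `dd_fraction` makes long fragments rare, which mirrors the point made above: the dideoxynucleotides must be kept in limiting quantities if strands several hundred nucleotides long are to be produced.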



2 February 2011

Genome Sizes

The genome of an organism is the complete set of genes specifying how its phenotype will develop (under a certain set of environmental conditions). In this sense, then, diploid organisms (like ourselves) contain two genomes, one inherited from our mother, the other from our father.

The table below presents a selection of representative genome sizes from the rapidly-growing list of organisms whose genomes have been sequenced.

Table of Genome Sizes (haploid)
Organism	Base pairs	Genes	Notes
φX174	5,386	11	virus of E. coli
Human mitochondrion	16,569	37
Epstein-Barr virus (EBV)	172,282	80	causes mononucleosis
Nanoarchaeum equitans	490,885	552	This parasitic member of the Archaea has the smallest genome of a true organism yet found.
nucleomorph of Guillardia theta	551,264	511	all that remains of the nuclear genome of a red alga (a eukaryote) engulfed long ago by another eukaryote
Mycoplasma genitalium	580,073	485	two of the smallest true organisms
Mycoplasma pneumoniae	816,394	680	two of the smallest true organisms
Chlamydia trachomatis	1,042,519	936	this bacterium causes the most common sexually-transmitted disease (STD) in the U.S.
Rickettsia prowazekii	1,111,523	834	bacterium that causes epidemic typhus
Treponema pallidum	1,138,011	1,039	bacterium that causes syphilis
Mimivirus	1,181,404	1,262	A virus (of an amoeba) with a genome larger than the six cellular organisms above
Pelagibacter ubique	1,308,759	1,354	smallest genome yet found in a free-living organism (marine α-proteobacterium)
Borrelia burgdorferi	1.44 x 10⁶	1,283	bacterium that causes Lyme disease (see the note below the table)
Campylobacter jejuni	1,641,481	1,708	frequent cause of food poisoning
Helicobacter pylori	1,667,867	1,589	chief cause of stomach ulcers (not stress and diet)
Thermoplasma acidophilum	1,564,905	1,509	These unicellular microbes look like typical bacteria but their genes are so different from those of either bacteria or eukaryotes that they are classified in a third kingdom: Archaea.
Methanococcus jannaschii	1,664,970	1,783
Aeropyrum pernix	1,669,695	1,885
Methanobacterium thermoautotrophicum	1,751,377	2,008
Haemophilus influenzae	1,830,138	1,738	bacterium that causes middle ear infections
Streptococcus pneumoniae	2,160,837	2,236	the pneumococcus
Neisseria meningitidis	2,184,406	2,185	Group A; causes occasional epidemics of meningitis in less developed countries.
Neisseria meningitidis	2,272,351	2,221	Group B; the most frequent cause of meningitis in the U.S.
Encephalitozoon cuniculi	2,507,519	1,997	(plus 69 RNA genes); a parasitic eukaryote.
Propionibacterium acnes	2,560,265	2,333	causes acne
Listeria monocytogenes	2,944,528	2,926	2,853 of these encode proteins; the rest RNAs
Deinococcus radiodurans	3,284,156	3,187	on 2 chromosomes and 2 plasmids; bacterium noted for its resistance to radiation damage
Synechocystis	3,573,470	4,003	a marine cyanobacterium (“blue-green alga”)
Vibrio cholerae	4,033,460	3,890	in 2 chromosomes; causes cholera
Mycobacterium tuberculosis	4,411,532	3,959	causes tuberculosis
Mycobacterium leprae	3,268,203	1,604	causes leprosy
Bacillus subtilis	4,214,814	4,779	another bacterium
E. coli K-12	4,639,221	4,377	4,290 of these genes encode proteins; the rest RNAs
E. coli O157:H7	5.44 x 10⁶	5,416	strain that is pathogenic for humans; has 1,346 genes not found in E. coli K-12
Agrobacterium tumefaciens	4,674,062	5,419	Useful vector for making transgenic plants; shares many genes with Sinorhizobium meliloti
Salmonella enterica var Typhi	4,809,037	4,395	+ 2 plasmids with 372 active genes; causes typhoid fever
Salmonella enterica var Typhimurium	4,857,432	4,450	+ 1 plasmid with 102 active genes
Yersinia pestis	4,826,100	4,052	on 1 chromosome + 3 plasmids; causes plague
Schizosaccharomyces pombe	12,462,637	4,929	Fission yeast. A eukaryote with fewer genes than the four bacteria below.
Ralstonia solanacearum	5,810,922	5,129	soil bacterium pathogenic for many plants; 1681 of its genes on a huge plasmid
Pseudomonas aeruginosa	6.3 x 10⁶	5,570	Increasingly common cause of opportunistic infections in humans.
Streptomyces coelicolor	6,667,507	7,842	An actinomycete whose relatives provide us with many antibiotics
Sinorhizobium meliloti	6,691,694	6,204	The rhizobial symbiont of alfalfa. Genome consists of one chromosome and 2 large plasmids.
Saccharomyces cerevisiae	12,495,682	5,770	Budding yeast. A eukaryote.
Cyanidioschyzon merolae	16,520,305	5,331	A unicellular red alga.
Plasmodium falciparum	22,853,764	5,268	Plus 53 RNA genes. Causes the most dangerous form of malaria.
Thalassiosira pseudonana	34.5 x 10⁶	11,242	A diatom. Plus 144 chloroplast and 40 mitochondrial genes encoding proteins
Neurospora crassa	38,639,769	10,082	Plus 498 RNA genes.
Naegleria gruberi	41 x 10⁶	15,727	This free-living unicellular organism lives as both an amoeboid and a flagellated form. 4,133 of its genes are also found in other eukaryotes suggesting that they were present in the common ancestor of all eukaryotes. The great variety of functions encoded by these genes also suggests that the common ancestor of all eukaryotes was itself as complex as many of the present-day unicellular members.
Caenorhabditis elegans	100,258,171	21,733	The first metazoan to be sequenced.
Arabidopsis thaliana	115,409,949	~28,000	a flowering plant (angiosperm); see the note below the table
Drosophila melanogaster	122,653,977	~17,000	the “fruit fly”
Anopheles gambiae	278,244,063	13,683	Mosquito vector of malaria.
Tetraodon nigroviridis (a pufferfish)	3.42 x 10⁸	27,918	Although Tetraodon seems to have more protein-encoding genes than we do, it has much less “junk” DNA so its total genome is about a tenth the size of ours.
Rice	3.9 x 10⁸	28,236
Sea urchin	8.14 x 10⁸	~23,300
Zebrafish	1.2 x 10⁹	15,761
Dogs	2.4 x 10⁹	19,300
Humans	3.3 x 10⁹	~21,000
Mouse	3.4 x 10⁹	~23,000
Amphibians	10⁹–10¹¹	?
Psilotum nudum	2.5 x 10¹¹	?	see the note below the table

Note: The gene total for Borrelia burgdorferi is based on 853 genes on its single chromosome (of 910,724 base pairs) plus 430 genes on 11 of the 17 plasmids it contains.

Arabidopsis thaliana is a plant (in the mustard family) that has one of the smallest genomes known among flowering plants and for this reason has become a favorite of plant molecular biologists.

Even though Psilotum nudum (sometimes called the “whisk fern”) is a far simpler plant than Arabidopsis (it has no true leaves, flowers, or fruit), it has roughly 2,000 times as much DNA (2.5 x 10¹¹ versus about 1.15 x 10⁸ base pairs in the table above). No one knows why, but 80% or more of it is repetitive DNA containing no genetic information. This is also the case for some amphibians, which contain 30 times as much DNA as we do but certainly are not 30 times as complex.

The total amount of DNA in the haploid genome is called its C value. The lack of a consistent relationship between the C value and the complexity of an organism (e.g., amphibians vs. mammals) is called the C value paradox.
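A quick calculation with a few rows of the table illustrates the paradox. The figures below are copied from the table (approximate entries are kept as given); the dictionary and function names are hypothetical, and "base pairs per gene" is used here only as a crude measure of how thinly the genes are spread.

```python
# (base pairs, genes) taken from the table of genome sizes above.
GENOMES = {
    "E. coli K-12": (4_639_221, 4_377),
    "Saccharomyces cerevisiae": (12_495_682, 5_770),
    "Tetraodon nigroviridis": (3.42e8, 27_918),
    "Humans": (3.3e9, 21_000),  # gene count is approximate in the table
}


def base_pairs_per_gene(name):
    """Average amount of DNA per gene for one organism in GENOMES."""
    base_pairs, genes = GENOMES[name]
    return base_pairs / genes


def size_ratio(larger, smaller):
    """How many times bigger one genome is than another."""
    return GENOMES[larger][0] / GENOMES[smaller][0]


for name in GENOMES:
    print(f"{name}: about {base_pairs_per_gene(name):,.0f} base pairs per gene")

ratio = size_ratio("Humans", "Tetraodon nigroviridis")
print(f"Human genome is about {ratio:.1f} times the size of Tetraodon's")
```

On these figures the human genome carries roughly 150 times as much DNA per gene as E. coli K-12 while having only about five times as many genes, so genome size is a poor guide to gene number.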

How many genes does it take to make an organism?

The scientists at The Institute for Genomic Research (now known as the J. Craig Venter Institute) who determined the Mycoplasma genitalium sequence followed this work by systematically disrupting its genes (by mutating them with insertions) to see which ones are essential to life and which are dispensable. They concluded that only 381 of its 485 protein-encoding genes are essential to life.


3 February 2011
