Concepts of Biology

Introduction to Cell & Molecular Biology (BIOL121) - Dr. S.G. Saupe (ssaupe@csbsju.edu); Biology Department, College of St. Benedict/St. John's University, Collegeville, MN 56321

A Primer on Molecular Genetics - Or, Life Has a Plan

I. Fundamental Questions
Biologists have long been interested in two fundamental questions: (a) what determines how an organism looks and behaves? and (b) why does an individual look like its parents?

The answer to both of these questions suggests that: (a) organisms have genetic information, a plan, or blueprints for constructing a new individual; and (b) there must be a mechanism for transmitting the information, blueprints, from parent to offspring.

Let's quickly review some terms in light of these observations:
genotype - refers to the specific genetic plan, the blueprints.
phenotype - refers to the outward expression of the genetic info (i.e., what the organism looks like or how it behaves).

II. Criteria for the Blueprints
Although early biologists didn't know what the blueprints were, they realized that the genetic information must meet several criteria:

be able to store information (enough to code for all of the traits/features of a complex organism);
account for variability/diversity among individuals (thus it must have the potential for storing lots of information);
transmit the message (inheritance);
be able to be copied exactly (replication); and
account for mutations (for evolutionary changes).

So, exactly what is the plan? We'll answer this question somewhat historically....

III. Location of the Blueprints
In order to identify the nature of the genetic blueprint, we need to determine exactly where in the cell it is located. Hämmerling's (1930s) transplant experiments with Acetabularia showed that the nucleus was required for regeneration of the cap and that the nucleus released temporary chemical instructions. Steward's work with carrots and Gurdon's work with the African clawed toad, Xenopus, showed that a single cell, with a single nucleus, contained all the genetic instructions necessary for an entire new individual. Conclusion: in eukaryotic cells, the genetic information is contained in the nucleus.

IV. What is the molecule of inheritance? Protein or DNA?
During the first half of the twentieth century there were two main candidates for the genetic information: protein and DNA. Both occurred in the nucleus and both were large molecules with the potential to store information. The predominant paradigm was that proteins, with their greater variation and complexity, were the most likely candidates for the genetic blueprints. DNA was thought to be too small and too uniform to be able to code for all of the instructions necessary for a complex individual.

A. Griffith (1928) & Sugar-Coated Microbes
Frederick Griffith worked with virulent (disease-causing) and avirulent (non-disease-causing) strains of the bacterium that causes pneumonia (Streptococcus pneumoniae). The virulent, or smooth (S), strain is "sugar-coated". In other words, it has a polysaccharide (carbohydrate or sugar) capsule that gives the colonies growing in culture a smooth appearance. This strain kills experimental mice because the capsule lets it evade the immune system. The non-virulent strain lacks a polysaccharide capsule, has a rough (R) appearance in culture, and is attacked by the immune system, so it does not kill experimental mice. He performed the following experiments:

virulent strain + mouse → mouse dies → live virulent strain was isolated from the mouse
avirulent strain + mouse → mouse lives
heat kill the virulent strain + mouse → mouse lives
heat kill the virulent strain + live avirulent + mouse → mouse dies. Yikes - what happened? In addition, live virulent strain could be isolated from the mouse.

Conclusions: The heat-killed virulent strain somehow converted the live avirulent strain into a virulent one. Thus there must be passage of instructions, called a transforming factor, to permanently convert the avirulent into the virulent strain. The nature of the instructions (transforming factor) was unknown.

B. Avery et al. (1944)
By the early 1940s it was known that just an extract (grind the cells up) of a virulent strain could transform an avirulent strain into a virulent one. Avery and colleagues chemically identified the transforming factor by selectively destroying different macromolecules in an extract of a virulent strain and observing whether the treated extract could still transform the avirulent strain (and so kill the mouse). The experiment was basically:

cell extract + RNAase (RNA digesting enzyme) + mouse → mouse dies
cell extract + DNAase (DNA digesting enzyme) + mouse → mouse lives
cell extract + protease (protein digesting enzyme) + mouse → mouse dies
cell extract + lipase (lipid digesting enzyme) + mouse → mouse dies

Conclusion: since digesting (removing) only the DNA results in preventing transformation, then DNA must be the hereditary instructions. Although it sounds logical and obvious, many still refused to believe these conclusions (biologists are pig-headed, too) because:

bacteria were not considered "typical" or representative models for all organisms;
the bacteria were considered too "crippled" and not behaving "normally";
there were traces of protein left in the DNA samples;
the paper was rather difficult to understand (i.e., poorly written with an obscure title);
published in a medical journal;
experiments were hard to replicate; and
Avery was too cautious and modest.

C. DNA/Protein Content During the Cell Cycle
One assumption biologists have always made about cell division is that just before a cell divides it would have to make a copy of the genetic instructions to pass on to the daughter cells (i.e., replication). It was discovered that the [DNA], but not [protein], of cells doubles prior to cell division. Hence, the ploidy level of the cell was correlated with the DNA content but not the protein. Conclusion: the blueprint is made of DNA.

D. Hershey/Chase (1952)
They worked with bacteriophages, which are viruses that attack bacteria. It was well known that: (a) these viruses contained only DNA and protein; (b) when these viruses infected a cell, the coat remained on the cell surface and the genetic instructions were injected into the cell; (c) the host cell used the viral instructions to make more virus; and (d) the host cell split open, releasing the new virus. So, which was injected - the DNA or protein?

They grew E. coli bacteria and the bacteriophage T2 in a growth medium enriched with either radioactive sulfur (³⁵S) or phosphorus (³²P). These were used as a marker for proteins (which contain sulfur) or nucleic acids (which contain lots of phosphorus), respectively. Then, they could see whether the radioactive material was inserted into the infected cell, or whether it remained on the bacterial cell surface. A brief outline of the experiment:

Preliminary:

E. coli + ³⁵S medium + T2 → T2 with ³⁵S labeled proteins

E. coli + ³²P medium + T2 → T2 with ³²P labeled DNA

Experiment:

Let ³⁵S-labeled T2 or ³²P-labeled T2 infect non-radioactive E. coli cells.

After an incubation period, homogenize the cells to separate the viral coat from the infected cell.

Centrifuge the mixture. The bacterial cells form a pellet at the bottom of the tube and the virus shells will remain in the supernatant because they are too light to form a pellet.

The ³⁵S was recovered in the supernatant and the ³²P was recovered in the pellet.

Conclusions:
    Since the ³⁵S was recovered in the supernatant with the virus, the viral coat must be made of protein. Since the ³²P was recovered in the pellet, the injected instructions must be made of DNA.   Hershey/Chase (1952) - "We infer that protein has no function in phage replication and that DNA has some function". This was the final and conclusive evidence that DNA is the hereditary information although even Hershey was still a little doubtful ("My own guess is that DNA will not prove to be a unique determiner of genetic specificity;" 1953).

E. Mutagens
Mutagenic agents were known to interact with DNA, not protein.

F. Conclusion
These studies convinced biologists that the genetic information was DNA. But, what did the molecule look like?

V. Structure of DNA
What is the structure of DNA? This was another difficult question to answer. After a very competitive race, Watson and Crick finally pieced together all the parts of the puzzle in 1953 and won a Nobel prize for their efforts. Note - they didn't do any experimental work. Rather, they worked with models trying to make sense of the data that were available to them. The pieces:

Miescher - first isolated DNA from sperm and pus cells (1868). These studies showed the genetic information was acidic and had an approximate chemical formula C₂₉ H₃₅ N₁₁ O₁₈ P₃. The molecule was rich in phosphorus. Later studies confirmed it was a nucleic acid - a polymer constructed from nucleotides.
Isolating DNA showed that it was fibrous and stringy. Thus it must be a long, filamentous molecule.
DNA is a nucleic acid.
DNA is a polymer of nucleotides. In turn, a nucleotide is composed of: (1) a five-carbon sugar (ribose - in RNA or deoxyribose - in DNA); (2) a phosphate group; and (3) a nitrogenous base. There are two kinds of nitrogenous bases: purines (2 rings) - adenine, guanine; and pyrimidines (1 ring) - cytosine, thymine, uracil.

    For convenience, biochemists refer to each carbon atom in both the sugar and nitrogenous bases by number. To distinguish between the two, the numbers of the carbons in the sugar are indicated by a prime (') symbol.   Thus, the carbons in the sugars are numbered from 1' to 5'.

    The three components of the nucleotides are linked like this: phosphate - 5'sugar1' - nitrogenous base

    To make DNA: the phosphate of one nucleotide is linked via a covalent bond to the 3' carbon of deoxyribose of the next nucleotide, and so on....
DNA has a linear backbone of alternating sugar and phosphates with the nitrogenous bases that stick out 90 degrees from the chain. The phosphate of one nucleotide is linked to the sugar of the next. This comes from the x-ray crystallographic data from Rosalind Franklin and Maurice Wilkins.
Chargaff observed that: (1) the content of A, T, G, and C is the same among individuals of a species but differs between species; and (2) the amount of A = T and C = G. Conclusion: adenine and thymine are complementary and pair with one another; guanine and cytosine are complementary.
DNA has a uniform diameter (more x-ray data), about 2 nm.
Model Building Inferences: (1) The DNA molecule is double stranded - held together by hydrogen bonds. There are two H-bonds between adenine and thymine and three between cytosine and guanine; and (2) the two strands are antiparallel (run in opposite directions). One end is the 3' side, the other is 5'.
DNA is a helix. The helix makes 1 turn every 3.4 nanometers. There are 10 nucleotide pairs for each turn and thus each is 0.34 nm apart. This comes from Franklin's data.
Putting it all together - overview of the structure of the DNA molecule (see pix in text, and provided in class). Conclusion: DNA is a double-stranded molecule that looks like a ladder that was sawed in half and then put back together with velcro. Each half of the ladder is made of a rail with half-rungs sticking out perpendicularly. The rails are made of alternating sugar and phosphates and the rungs are nitrogenous bases. The rails run in opposite directions (anti-parallel). Velcro represents the hydrogen bonds holding the rungs together.
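
The pairing rules and helix dimensions listed above can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the 20-nucleotide example sequence are made up for the demonstration and are not part of the historical data.

```python
# Illustrative sketch only: names and the example sequence are hypothetical.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}   # complementary bases
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}        # H-bonds in the pair each base forms
RISE_NM = 0.34                                    # distance between adjacent base pairs
PAIRS_PER_TURN = 10                               # nucleotide pairs per helical turn

def partner_strand(strand):
    """Return the antiparallel partner strand, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(strand))

def duplex_stats(strand):
    """Base counts over both strands, total H-bonds, length (nm) and turns."""
    both = strand + partner_strand(strand)
    counts = {base: both.count(base) for base in "ATGC"}
    h_bonds = sum(H_BONDS[base] for base in strand)
    length_nm = RISE_NM * len(strand)
    turns = len(strand) / PAIRS_PER_TURN
    return counts, h_bonds, length_nm, turns

counts, h_bonds, length_nm, turns = duplex_stats("ATGCGGATTACAGGCTTAAC")
print(counts)      # A equals T and G equals C, as Chargaff observed
print(h_bonds, length_nm, turns)
```

Because every A on one strand faces a T on the other (and every G faces a C), the A = T and G = C equalities fall out of the double-stranded model automatically.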

VI. Checking the Model
Remember our criteria for the DNA. The structure of DNA fit the model beautifully since it could:

Store information
Information is stored in the sequence of nucleotides. Thus, a gene is simply a segment of DNA that has a particular sequence of nucleotides.
Transmit the message
Although it was well known that DNA duplicated prior to cell division, the exact mechanism by which the information could be read wasn't known (we call it transcription and translation).
Be able to be copied exactly (replication)
Even Watson and Crick noted that if each half of the molecule made the other, or complementary half, it would allow DNA to exactly replicate itself. ("It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism.") Clever lads.

Account for variability/diversity
The number of combinations of nucleotides is, for practical purposes, unlimited. For example, how many ways can you arrange 2 objects (A & B) in 3 positions?

position 1: A A A B B B A B

position 2: A A B B B A B A

position 3: A B B B A A A B

This can be mathematically expressed as n^x, where n is the number of objects and x is the number of positions. Thus, for the example above, the number of possible ways to arrange the two objects (A, B) in three positions is 2³ = 8. Now, consider a typical gene that is 300 nucleotides long. There are 4³⁰⁰ possible ways to arrange the nucleotides! That's a lot of potential information storage.
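
The n^x counting rule is easy to verify by brute force. The short Python sketch below (the function name is just an illustrative choice) lists every arrangement of A and B in three positions and then shows how large 4³⁰⁰ really is.

```python
from itertools import product

def arrangements(objects, positions):
    """List every way to fill `positions` slots using the given objects."""
    return ["".join(combo) for combo in product(objects, repeat=positions)]

two_in_three = arrangements("AB", 3)
print(two_in_three)                  # the same eight arrangements tabulated above
print(len(two_in_three) == 2**3)

# A 300-nucleotide gene built from 4 kinds of nucleotide: 4**300 sequences.
gene_possibilities = 4**300
print(len(str(gene_possibilities)))  # number of decimal digits in 4**300
```

Python integers have no size limit, so 4**300 is computed exactly; it is a number 181 digits long.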

Account for mutations to allow for evolutionary change
It is easy to imagine that a mutation is simply a change in the sequence of nucleotides.

VII. Replication - the process by which DNA makes an exact copy of itself.

A. Where does it occur?
In the nucleus (at least in eukaryotes)

B. When does it occur?
During interphase (specifically S)

C. Why does it occur?
To provide genetic information for each of the daughter cells produced during mitosis/meiosis (or fission in bacteria)

D. How does it occur?
There were three major hypotheses for how DNA replicated:

Conservative Model - each molecule serves as a template to make another molecule such that you wind up with the original molecule and a brand new one;
Semi-conservative model - each strand (or half of the DNA molecule, one half of the "ladder") serves as a template to make the other half (complementary strand). Thus, each new molecule is made up of one original strand and one new one; and
Dispersive Model - the DNA breaks up and makes hybrid molecules each with old and new DNA interspersed.

Based on the structure, the most likely model was semi-conservative (even Watson & Crick speculated this from the structure). So, how did biologists demonstrate this to be true?

Meselson/Stahl (1958) Experiment
    They demonstrated that DNA replication occurs by a semi-conservative method. Brief overview of the experiment: Grow E. coli in medium enriched with heavy (¹⁵N) nitrogen. Then transfer to regular medium (¹⁴N) and allow to grow for one generation (or more). Grind up the cells, extract the DNA and centrifuge in a cesium chloride gradient to separate the DNA molecules based on density. Thus, a cell can potentially have three possible types of DNA: (a) "heavy" - both strands labeled with ¹⁵N (a heavy but non-radioactive isotope); (b) "light" - neither strand labeled; and (c) "semi-heavy" - one strand is labeled, the other is not.

     Cells grown in ¹⁵N containing medium all have heavy DNA. After one generation in regular medium, all the daughter cells have semi-heavy DNA. After several generations, some cells have light DNA and others have semi-heavy. Conclusion: replication is semi-conservative.

    Detail provided on diagram in class.
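
The logic of the experiment can be sketched as a small bookkeeping exercise in Python. This is a toy model, not the actual analysis: each DNA molecule is represented as a pair of strands marked "H" (¹⁵N) or "L" (¹⁴N), and the function names are invented for illustration.

```python
from collections import Counter

def replicate_semiconservative(molecules, new="L"):
    """Each old strand is kept and paired with one newly made strand."""
    daughters = []
    for strand_a, strand_b in molecules:
        daughters.append((strand_a, new))
        daughters.append((strand_b, new))
    return daughters

def replicate_conservative(molecules, new="L"):
    """The original molecule stays intact; the copy is entirely new."""
    daughters = []
    for molecule in molecules:
        daughters.append(molecule)
        daughters.append((new, new))
    return daughters

def bands(molecules):
    """Classify molecules the way the density gradient would separate them."""
    names = {("H", "H"): "heavy", ("L", "L"): "light"}
    return Counter(names.get(molecule, "semi-heavy") for molecule in molecules)

start = [("H", "H")]                       # cells grown in 15N medium
gen1 = replicate_semiconservative(start)   # one generation in 14N medium
gen2 = replicate_semiconservative(gen1)    # two generations in 14N medium
print(bands(gen1))                         # only semi-heavy DNA
print(bands(gen2))                         # semi-heavy and light DNA
print(bands(replicate_conservative(start)))  # heavy + light: not what was observed
```

Only the semi-conservative rule gives all semi-heavy DNA after one generation, which matches the observed result.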

E. How Does it Occur? The Mechanism of Replication
DNA is made in replication "factories" that are comprised of a variety of enzymes (see below) and proteins. In short, the DNA is essentially threaded into the factory complex and is duplicated. This process is obviously rather complicated but involves:

The DNA is threaded into the replication factory complex.
Among other things the replication factory contains a large complex of enzymes such as DNA polymerase, ligase, primase, and helicase. In addition, there is an assortment of proteins that are required for the process.
The DNA must be unwound and the two strands are separated.
Enzymes (helicases) in the factory unwind the DNA helix at specific initiation sites, called origins of replication. These sites are determined by the specific sequence of nucleotides in the DNA. There are many such sites in eukaryotic DNA, only one in a bacterial chromosome. The result is a "replication bubble" at each origin of replication. The DNA unwinds in both directions from the origin; each end of the bubble is called a replication fork;
The unwound & unpaired DNA must be stabilized.
Unpaired DNA is "unstable" and wants to rejoin with its complementary half. To help to stabilize the replication bubble additional proteins (single-strand binding proteins) attach to the DNA.
Complementary nucleotides are matched with the unpaired nucleotides in each strand.
DNA polymerase is the primary enzyme responsible for replicating DNA. Adenine is matched to thymine, cytosine to guanine; and vice versa.
DNA polymerase can only add a new nucleotide to the end of a pre-existing strand of DNA or RNA.
A small piece of RNA, called a primer, is first matched with the DNA at the origin of replication by the enzyme primase. Primase occurs in an aggregate of proteins called a primosome (remember all of this is happening in the factory).
DNA polymerase can only add new nucleotides to the 3' end of the primer or the elongating strand of DNA (i.e., to the sugar part).
Thus, DNA is always made in the 5' to 3' direction. This is no problem with one strand (leading strand) of the DNA. It is made in one long continuous piece. However, the other strand, the lagging strand, which is anti-parallel, is made in short segments (called Okazaki fragments), again in the 5' to 3' direction.
The RNA primer pieces are snipped out by other enzymes, the gaps are filled in with DNA, and the chains are sealed together by ligase.
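
The matching and joining steps above can be mimicked with a short Python sketch. It is a deliberate simplification (no primers, no enzymes, and a made-up template and fragment size): it only shows that a new strand is the complement of its template read in the opposite direction, and that short fragments joined end to end give the same strand as one continuous piece.

```python
# Toy model: the template sequence and fragment size are hypothetical.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def new_strand(template_5to3):
    """Build the complementary strand, written 5' to 3'.

    The template is read from its 3' end, so the new strand grows 5' to 3'.
    """
    return "".join(PAIR[base] for base in reversed(template_5to3))

def okazaki_fragments(strand_5to3, size):
    """Split a strand into short pieces, standing in for Okazaki fragments."""
    return [strand_5to3[i:i + size] for i in range(0, len(strand_5to3), size)]

def ligate(fragments):
    """Join fragments end to end, standing in for the sealing done by ligase."""
    return "".join(fragments)

template = "ATGGCATTCGGA"
continuous = new_strand(template)             # leading-strand style, one piece
pieces = okazaki_fragments(continuous, 4)     # lagging-strand style, short pieces
print(continuous)
print(pieces)
print(ligate(pieces) == continuous)
```

Copying the new strand in turn regenerates the original template, which is why each strand can serve as the pattern for the other.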

F. Fixing Mistakes
Just as in any factory line, some mistakes are made as the DNA is replicated. In fact, DNA polymerase makes about one mistake every 10⁶ nucleotides (about one in a million). Although Detroit would be happy if this were the mistake rate for one of their factories, it is unacceptable for life. Fortunately, unlike mistakes made during the assembly of an automobile, the mistakes in making DNA are fixed before it leaves the factory:

DNA polymerase proofreads the DNA as it goes and wrong bases are removed and replaced.
Any mistakes missed in proofreading are usually caught by final quality control inspectors - mismatched bases are replaced (mismatch repair) or damaged nucleotides are snipped out (excision repair).

The net result is that the mistake rate is very low - only about 1 base in every billion nucleotides!
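
To see what these rates mean in practice, here is a rough back-of-the-envelope calculation in Python. The two error rates come from the notes above; the genome size of about 3 billion nucleotides (roughly the human genome) is an outside assumption added only for illustration.

```python
from fractions import Fraction

POLYMERASE_ERROR = Fraction(1, 10**6)  # from the notes: ~1 mistake per million
FINAL_ERROR = Fraction(1, 10**9)       # from the notes: ~1 per billion after repair
GENOME_SIZE = 3 * 10**9                # ASSUMPTION: approximate human genome size

def expected_mistakes(rate, nucleotides):
    """Average number of wrong bases when copying `nucleotides` bases."""
    return rate * nucleotides

before_repair = expected_mistakes(POLYMERASE_ERROR, GENOME_SIZE)
after_repair = expected_mistakes(FINAL_ERROR, GENOME_SIZE)
improvement = POLYMERASE_ERROR / FINAL_ERROR

print(before_repair)   # mistakes per genome copy without repair
print(after_repair)    # mistakes per genome copy with proofreading and repair
print(improvement)     # fold reduction in the mistake rate
```

Under these assumptions, proofreading and repair cut the expected number of mistakes per genome copy from thousands to a handful.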

VIII. PCR - Copying DNA in a Test Tube (not on exam)

PCR = polymerase chain reaction
technique to amplify (increase) small amounts of DNA.
DNA polymerase obtained from a thermophilic bacterium (Thermus aquaticus); the enzyme can tolerate heating (doesn't denature), which is necessary to 'melt' or separate the two strands of DNA
summary: mix sample containing DNA to be amplified with nucleotides, DNA polymerase and primers → heat mixture (90 °C) → DNA 'melts' (strands separate) → temperature reduced (55 °C) → DNA polymerase makes copy of DNA using primers and nucleotides → heat again to separate strands → repeat process (doubling the amount of DNA each time)
common forensic technique (e.g., CSI)
Kary Mullis won Nobel Prize for developing technique
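
Because each cycle doubles the DNA, the amount grows as a power of 2. The Python sketch below assumes perfect doubling in every cycle (an idealization; real reactions are less efficient), and the function names are illustrative.

```python
def copies_after(cycles, start=1):
    """Number of DNA copies after a given number of ideal PCR cycles."""
    return start * 2**cycles

def cycles_needed(target, start=1):
    """Smallest number of ideal cycles that yields at least `target` copies."""
    cycles = 0
    while copies_after(cycles, start) < target:
        cycles += 1
    return cycles

for n in (1, 10, 20, 30):
    print(n, copies_after(n))

print(cycles_needed(1_000_000))   # cycles to turn 1 molecule into a million
```

Starting from a single molecule, 20 ideal cycles give about a million copies and 30 cycles give about a billion, which is why trace samples are enough for forensic work.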

IX. Sequencing DNA (not on exam)

Human Genome Project
Complete nucleotide sequences now known for several species, including humans
DNA synthesis, in vitro, requires: a template strand, a supply of dNTP's (nucleotides like adenine, guanine, cytosine and thymine), a primer, and DNA polymerase
The sequencing technique also incorporates ddNTP - dideoxynucleotides. These are similar to the normal nucleotides (dNTP) EXCEPT the ddNTP's are missing a hydroxyl group on the 3' carbon. Thus, if DNA polymerase uses one of these for DNA synthesis instead of the normal dNTP's, no additional nucleotides can be added because there is no free 3' hydroxyl group to add the next nucleotide.
To distinguish the four ddNTPs, each is bonded to a different fluorescent dye.
Sequencing technique: single strand of DNA serves as a template → mixed with primer, DNA polymerase, 'normal' nucleotides (dNTP), and dideoxynucleotides (ddNTP) → DNA polymerase, starting with the primer, begins to match complementary nucleotides on the template strand with - at random - either dNTP or ddNTP nucleotides → if dNTP is used the chain continues to grow, if ddNTP is used chain growth stops → ultimately all chains are terminated with a ddNTP → newly made strands separated from template DNA → strands separated based on size using gel electrophoresis → since each strand terminates with a ddNTP, a laser is used to detect the dye on the terminal ddNTP of each strand → reading the dyes from the shortest strand to the longest gives the sequence of the newly made strand, from which the sequence of the original template strand is deduced
Check out the Human Genome Project video on-line for a description of the techniques used to sequence the human genome.
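
The chain-termination idea can be sketched in Python. This is a toy model under stated assumptions: the primer is ignored, the template is a made-up sequence written 3' to 5', and instead of random termination the code simply lists every possible terminated strand (in a real reaction, with enough molecules, every length is represented).

```python
import random

# Toy model: names and the example template are hypothetical.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def terminated_strands(template_3to5):
    """All possible chain-terminated products, each written 5' to 3'.

    Every product ends in a ddNTP, so its last base is the one carrying the dye.
    """
    full_copy = "".join(PAIR[base] for base in template_3to5)
    return [full_copy[:length] for length in range(1, len(full_copy) + 1)]

def read_gel(strands):
    """Sort strands by size (the gel) and read each terminal base (the dye)."""
    by_size = sorted(strands, key=len)
    return "".join(strand[-1] for strand in by_size)

def deduce_template(new_strand_5to3):
    """Convert the newly made strand back to the template, written 3' to 5'."""
    return "".join(PAIR[base] for base in new_strand_5to3)

template = "TACGGTCA"                       # written 3' to 5'
mixture = terminated_strands(template)
random.Random(0).shuffle(mixture)           # the reaction tube is an unordered mix
sequence_read = read_gel(mixture)
print(sequence_read)
print(deduce_template(sequence_read) == template)
```

Sorting by length is what the gel does physically; the order of the dyes then spells out the new strand one base at a time.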