Biochemistry Online: An Approach Based on Chemical Logic


B:  More Complex Carbohydrates


Learning Goals/Objectives for Chapter 3B:  After class and this reading, students will be able to

  • state the differences between the homopolysaccharides glycogen, starch, cellulose, and chitin and those with disaccharide repeat units (glycosaminoglycans)
  • draw cartoon models of complex oligosaccharides such as peptidoglycans of bacterial cell walls, N- and O-linked glycoproteins, and proteoglycans showing the linkage of protein and CHO
  • describe the role of protein and cell surface CHO in binding and biological function
  • given diagrams of leukocyte and endothelial cell interactions, describe the role of selectins, selectin ligands, integrins and cellular adhesion molecules in immune cell/blood vessel interactions.

This chapter on complex carbohydrates (glycans/glycoconjugates) will review those features that are deemed especially important for a one semester course dealing with structure and function of biomolecules.

B1.  Polysaccharides

These contain many monosaccharides in glycosidic links, and may contain many branches. They serve as either structural components or energy storage molecules. The most common polysaccharides consisting of a single type of monosaccharide (homopolysaccharides) are glycogen, starch, cellulose, and chitin.

The basic chemical structures of these homopolymers are shown below.

Figure: Homopolysaccharides in Chair Conformations (panel label: Glc β(1->4) Glc link)


It makes great chemical sense to store Glc residues as glycogen or starch, each of which is one large molecule. A review of colligative properties would tell you that if all the Glc were stored as the monosaccharide, a large osmotic pressure difference would exist between the inside and the outside of the cell.  It also makes sense for glycogen to be a highly branched polymer rather than a linear one. When Glc is needed, it is cleaved one residue at a time from all the branches (at the nonreducing ends), producing a large amount of free Glc in a short time.
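The osmotic argument above can be made concrete with the van 't Hoff relation, Π = cRT, for an ideal, non-dissociating solute. The sketch below is illustrative only: the 0.4 M concentration of Glc residues, the 55,000 residues per glycogen particle, and the temperature of 310 K are assumed round numbers, not values taken from this chapter.

```python
R = 0.082057  # gas constant in L·atm/(mol·K)

def osmotic_pressure_atm(molar_conc, temp_k=310.0):
    """Ideal (van 't Hoff) osmotic pressure in atm for a non-dissociating solute."""
    return molar_conc * R * temp_k

# Assumed, illustrative numbers (not from the text):
glc_equiv = 0.4                  # mol/L of Glc residues held in the cell
residues_per_particle = 55_000   # Glc residues in one glycogen particle

# Same number of Glc residues, counted as free monomers or as polymer particles
free = osmotic_pressure_atm(glc_equiv)
stored = osmotic_pressure_atm(glc_equiv / residues_per_particle)

print(f"free Glc:    {free:.2f} atm")
print(f"as glycogen: {stored:.2e} atm")
```

Because osmotic pressure depends on the number of dissolved particles and not on their size, packing the residues into polymer lowers the pressure by a factor equal to the number of residues per particle.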

 Phi/Psi angles can also be described for the starch/glycogen main chain (around the acetal O) in a fashion comparable to that for proteins (around the alpha carbon).  The phi torsion angle describes rotation around the C1-O bond of the acetal link, while the psi angle describes rotation around the O-C4 bond of the same acetal link, with the glucopyranose ring considered as a rigid rotator (just like the 6 atoms in the planar peptide bond unit).  The most extended form of a (Glc)n polymer occurs when the glycosidic link is β1->4 (as in cellulose), which forms linear chains.  The α1->4 linked main chain of glycogen and starch causes the chain to turn and form a large helix, into which iodine (or I3-) can fit, turning starch dark blue to purple.
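A torsion angle such as phi or psi can be computed from the coordinates of four consecutive atoms. The sketch below is a generic signed dihedral calculation in plain Python. Which four atoms define each angle varies between conventions; the quadruples named in the comments (O5-C1-O-C4' for phi, C1-O-C4'-C3' for psi) are one choice assumed here for illustration, and the coordinates in the usage line are made up rather than taken from a real structure.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Assumed atom choices (conventions differ):
#   phi = torsion_deg(O5, C1, O, C4')    rotation about the C1-O bond
#   psi = torsion_deg(C1, O, C4', C3')   rotation about the O-C4' bond
# Made-up coordinates, only to show the call:
angle = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

Treating each glucopyranose ring as rigid, as the text does, means a chain conformation is described by one (phi, psi) pair per glycosidic link.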


Jsmol:  Glycogen  |  Amylose  |  Amylose-2  |  Amylopectin with I3-  |  Amylopectin  |  Cellulose

Many polysaccharides consist of repeating disaccharide units. Agarose, a polymer of a disaccharide repeat of (1-->3)-β-D-galactopyranose-(1 --> 4)-3,6-anhydro-α-L-galactopyranose, is often used as a gel-forming solid phase for electrophoresis of nucleic acids and as a component of chromatography beads.  A major class of polysaccharides with disaccharide repeats includes the following glycosaminoglycans (GAGs), all of which contain one amino sugar in the repeat and in which one or both of the sugars contain negatively charged sulfate or carboxyl groups. The extent and position of sulfation vary widely between and within GAGs.

  • hyaluronic acid, a polymer of glucuronate (β 1->3) GlcNAc: water soluble, found in synovial fluid; serves as a backbone for attachment of proteins and GAGs
  • dermatan sulfate, L-iduronate (β 1->3) GalNAc-4-sulfate
  • keratan sulfate, D-Gal (β 1->4) GlcNAc-6-sulfate
  • chondroitin sulfate, D-glucuronate (β 1->3) GalNAc-4 or 6-sulfate
  • heparin, D-glucuronate-2-sulfate (α 1->4) GlcNSulfo-6-sulfate

GAGs are found in the vitreous humor of the eye and synovial fluid of joints, and in connective tissue like tendons, cartilage, etc, as well as skin. They are found in the extracellular matrix and are often covalently attached to proteins to form proteoglycans.

Figure: Glycosaminoglycans


Jmol:  Heparin

A New Visual Nomenclature for Glycobiology

A new symbolic nomenclature for carbohydrates in which monosaccharides are denoted by specific colored geometric shapes has been proposed by the Consortium for Functional Glycomics (2005). 

Figure:  CHO symbolic nomenclature

This nomenclature was updated in Appendix 1B of Essentials of Glycobiology, 3rd Edition (Glycobiology 25(12): 1323-1324, 2015; doi: 10.1093/glycob/cwv091; PMID 26543186).


B2.  Cell Walls

In contrast to animal cells, bacterial cells have a cell wall in addition to a lipid bilayer membrane. Cell walls are essentially carbohydrate polymers that protect the cell from hypotonic exterior conditions and high internal osmotic pressure, preventing swelling and bursting. The wall consists of a peptidoglycan. Two types exist.

a. Gram positive bacteria - These bacteria can be stained with Gram stain. The wall consists of a GlcNAc (β 1->4) MurNAc repeat.  This is similar to the GlcNAc (β 1->4) GlcNAc homopolymer chitin, except that every other GlcNAc contains a lactate molecule covalently attached in an ether linkage to the C3 hydroxyl to form the monomer N-acetylmuramic acid (MurNAc).   A tetrapeptide (Ala-D-isoGlu-Lys-D-Ala) is attached in amide link to the carboxyl group of the lactate in MurNAc. The GlcNAc (β 1->4) MurNAc strands are covalently connected by a pentaglycine bridge between the epsilon amino group of the tetrapeptide Lys on one strand and the D-Ala of a tetrapeptide on another strand.



One final structure is found in Gram positive cell walls. Teichoic acids are often attached to the C6 of MurNAc. Teichoic acid is a polymer of glycerol or ribitol in which GlcNAc and D-Ala are linked alternately to the middle C of the glycerol. Multiple glycerols are linked through phosphodiester bonds. These teichoic acids often make up 50% of the dry weight of the cell wall, and present a foreign (or antigenic) surface to infected hosts. They often serve as receptors for viruses that infect bacteria (called bacteriophages).

Figure: Teichoic Acid

Figure:  Gram Positive Bacterial Cell Wall 


Jmol:  Peptidoglycan glycosyl transferase

b. Gram negative bacteria -

These bacteria can NOT be stained with Gram stain. The wall consists of the same structure as in Gram positive bacteria, but the GlcNAc (β 1->4) MurNAc strands are covalently connected through a direct amide bond between the epsilon amino group of the tetrapeptide Lys on one strand and the D-Ala of a tetrapeptide on another strand (i.e. there is no pentaGly spacer). In addition, Gram negative bacteria do not have teichoic acid polymers. Rather, they have a second, outer lipid bilayer. The cell wall is sandwiched between the inner and outer bilayers. The space between the lipid bilayers is called the periplasmic space. A hydrophobic protein covalently attaches (through an amide link from a protein Lys) to the cell wall at the last amino acid in the tetrapeptide unit of the wall (actually diaminopimelic acid, which replaces about 10% of the D-Ala in the cell wall). The N-terminus of the hydrophobic protein attaches to the outer lipid membrane through a Ser.  The outer membrane is coated with a lipopolysaccharide (LPS) of varying composition. The LPS determines the antigenicity of the bacteria. The different LPS are called the O-antigens.

Figure:  A detailed view of LPS


B3.  N-linked Glycoproteins

Many proteins, especially those destined for secretion or insertion into membranes, are post-translationally modified by attachment of carbohydrates. They are usually attached through either Asn or Ser side chains. Carbohydrate modifications on the protein appear to be involved in recognition of other binding molecules, prevention of aggregation during protein folding, protection from proteolysis, and increased half-life of the proteins. In contrast to a protein sequence, which is determined by a DNA template, sugars are attached to proteins by enzymes that recognize appropriate sites on proteins and attach the sugars. Since there are many sugars that contain many functional groups that can serve as potential attachment sites, the structures of the oligosaccharides attached to proteins are enormously varied, complex, and hence "information rich" compared to linear or folded polymers like DNA and proteins.

a. N-linked Glycoproteins

These contain CHOs attached through either a GlcNAc or GalNAc to an Asn in an X-Asn-X-Thr sequence of the protein (more generally written as the sequon Asn-X-Ser/Thr, where X is any amino acid except Pro). There are three types of N-linked glycoproteins: high mannose, complex, and hybrid.  They all contain the same core oligosaccharide, (Man)3(GlcNAc)2, attached to Asn.
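Candidate N-glycosylation sites can be located by scanning a protein sequence for this Asn-containing motif. The sketch below is a minimal illustration, assuming the widely used general form of the sequon, Asn-X-Ser/Thr with X not Pro; the function name and the test sequence are hypothetical. A match marks only a potential site, since whether a given Asn is actually glycosylated depends on the enzymes and the cellular context.

```python
import re

# N, then any residue except P, then S or T.
# The lookahead lets overlapping sequons all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(seq):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]

# Hypothetical peptide, one-letter amino acid codes
print(find_sequons("MNGTANPSNNSS"))  # [2, 9, 10]
```

The Asn at position 6 is skipped because it is followed by Pro.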

Figure: N-linked Glycoproteins

Figure:  N-linked high mannose glycoproteins

Figure:  N-linked complex glycoproteins

Figure:  N-linked hybrid glycoproteins

Note that in the hybrid oligosaccharide, one terminus contains Gal(β-1,4)GlcNAc.  However, in all mammals except humans, apes, and Old World monkeys, an additional Gal is often connected in an α-1,3 link to the Gal to give a terminus of Gal(α-1,3)Gal(β-1,4)GlcNAc.  These animals have an additional enzyme, α-1,3 Gal transferase.  Bacteria also have this enzyme, and since we have been exposed to this link through bacterial infection, we mount an immune response against it.  Why is this important?  Pig hearts turn out to be similar to human hearts, so they might be good candidates for transplantation into humans (xenotransplants).  However, the Gal-α-1,3-Gal link is recognized as foreign, and we mount a significant immune response against it.  Several biotech firms are trying to delete the pig α-1,3 Gal transferase, which would prevent the addition of the terminal Gal and make pigs good donors for transplanted hearts.

Jmol:  N-linked Glycoprotein

Influenza and the Avian Flu

The influenza virus is a simple yet deadly virus (shown below).  It interacts with human cells through a surface protein, hemagglutinin (HA).

credit:  © Paul Digard, Dept Pathology, University of Cambridge

The virus binds to host cells through interaction of HA with cell surface carbohydrates.  Once bound the virus internalizes, ultimately leading to release of the RNA genome of the virus into the host cell. 

Animation:  Influenza entering cell

The hemagglutinin protein is the most abundant protein on the viral surface (as surmised from antibody formation).  Fifteen avian and mammalian variants have been identified (based on antibody studies).  Only three have adapted to humans in the last 100 years, giving the pandemic strains H1 (1918), H2 (1957), and H3 (1968).  Three avian variants (H5, H7, and H9) have recently jumped directly to humans but have low human-to-human transmissibility.

The influenza hemagglutinin protein has the following characteristics:

Jmol:  Hemagglutinin antigen | Another View

Hemagglutinin binds to sialic acid (Sia), which is covalently attached to many cell membrane glycoproteins.   The sialic acid is usually connected through an α(2,3) or α(2,6) link to galactose on N-linked glycoproteins.   The subtypes found in avian (and equine) influenza isolates bind preferentially to Sia α(2,3) Gal, which predominates in the avian GI tract where the viruses replicate.  Human influenza isolates prefer Sia α(2,6) Gal.  Human viruses of the H1, H2, and H3 subtypes (the causes of the 1918, 1957, and 1968 pandemics) recognize Sia α(2,6) Gal, the major form in the human respiratory tract.  The swine influenza HA binds to Sia α(2,6) Gal and to some Sia α(2,3) Gal, both of which are found in swine.

Figure:  Sia α(2,6) Gal (human) and Sia α(2,3) Gal (avian and some swine). Both structures were made with Sweet, with an OH, not AcNH, on C5 of sialic acid.

Structures from:

The present avian flu (H5N1) is deadly but lacks human-to-human transmissibility at the moment.  Why?  One reason is that it appears to bind deep in the lungs and is not released easily on coughing or sneezing.  It appears that cell surface glycoproteins deeper in the respiratory tract have Sia α(2,3) Gal, which accounts for this pathology.

The virus, before it leaves the cell, forms a bud on the intracellular side of the cell with the HA and neuraminidase (NA) in the cell membrane of the host cell.  The virus in this state would not leave the cell, since its HA molecules would interact with sialic acid residues in the host cell membrane, holding the virus in the membrane.  Neuraminidase hydrolyzes sialic acid from cell surface glycoproteins, allowing the virus to complete the budding process and be released from the cell as new viruses. The drugs oseltamivir (Tamiflu) and zanamivir (Relenza) bind to and inhibit neuraminidase, whose activity is necessary for viral release from infected cells. Tamiflu appears to work against N1 of the present H5N1 avian influenza viruses.  Governments across the world are stockpiling this drug in case of a pandemic caused by the avian virus jumping directly to humans and becoming transmissible from human to human.

Jmol:  Oseltamivir: Neuraminidase N1 complex


B4.  O-linked Glycoproteins

The CHOs are usually attached from a Gal (β 1->3) GalNAc to a Ser or Thr of a protein.

Figure: O-linked Glycoproteins

The blood group antigens (CHOs on cells attached to either proteins or lipids) are examples. The sugars shown as chairs in the figure below are the blood group antigens (in contrast to structures found in many texts).  They are attached to a core heterosaccharide (shown as a red ellipse below), which is connected to either a membrane glycoprotein or glycolipid.

Figure: Blood Group Antigens

B5.  Proteoglycans

Some proteins are so heavily modified with CHOs that they contain more CHOs than amino acids. Proteins linked to glycosaminoglycans are together called proteoglycans (PGs). The structures of a few proteoglycans are known. The GAGs are O-linked to the protein, typically to a Ser of a Ser-Gly dipeptide that is often repeated in the protein. Some of the proteoglycans also contain N-linked oligosaccharide groups.

PGs can be soluble and found in the extracellular matrix, or can be integral membrane proteins. Given the diversity of sugars and the varying extent of sulfation, the CHO part of PGs provides an incredible variety of binding structures at or near the cell surface. One PG, syndecan, binds through its intracellular domain to the internal cytoskeleton of the cell, while interacting with another protein - fibronectin - in the extracellular matrix. Fibronectin also binds other molecules which can regulate cellular growth and other interactions. PGs act like glue in connecting the extracellular and intracellular functions of the cell. Most proteins that bind PGs do so through a PG binding motif of BBXB or BBBXXB, where B is a basic amino acid. Some proteins bind to specific sequences in specific GAGs. For instance, antithrombin III, an inhibitor of blood clotting, binds specifically to heparin, which enhances its interaction with the clotting protein thrombin.
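The BBXB and BBBXXB motifs are simple enough to search for directly in a protein sequence. The sketch below is a hedged illustration, not a validated predictor: it assumes B means Lys or Arg (His can be added through the hypothetical `basic` parameter) and treats X as any residue, which the text does not specify. The function name and example sequences are made up.

```python
import re

def gag_binding_motifs(seq, basic="KR"):
    """Return (motif, 1-based start, matched text) for BBXB and BBBXXB hits.

    Assumptions: B is any residue listed in `basic`; X is any residue.
    """
    b = f"[{basic}]"
    patterns = {
        "BBXB": f"(?=({b}{b}.{b}))",         # lookahead finds overlapping hits
        "BBBXXB": f"(?=({b}{b}{b}..{b}))",
    }
    hits = []
    for name, pattern in patterns.items():
        for m in re.finditer(pattern, seq.upper()):
            hits.append((name, m.start() + 1, m.group(1)))
    return sorted(hits, key=lambda hit: hit[1])

# Hypothetical sequences, one-letter amino acid codes
print(gag_binding_motifs("AKRGKAA"))
print(gag_binding_motifs("ARKKAARA"))
```

A sequence match of this kind only suggests a possible GAG-binding site; real binding also depends on how the basic side chains are arranged in the folded protein.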

B6.  Eukaryotic Cell Membranes

We have studied lipids, proteins, and carbohydrates. Although phospholipids can spontaneously form bilayers, the actual structure of biological membranes is made much more complicated through addition of protein and carbohydrate substituents to the membrane.  Soluble proteins can be made to insert into bilayers by addition of nonpolar attachments. Several examples of such attachments include:

Figure:  Biological Membranes:  Simple to Complex

Figure:  A cool view of a membrane surface


B7.  Role of Cell Surface Carbohydrates

Cell surface carbohydrates present information-rich binding sites for other molecules and act as "receptors" for biological agents as diverse as viruses, bacteria, toxins, and other cells.  This is illustrated well by studying the properties of circulating immune cells.  The cells must often pass through the walls of capillaries as they home in on a site of infection.  (Cancer cells do this as well, as they escape the boundaries of the organ in which they developed and pass through the blood vessels and into new tissues in the process of forming metastases.)  Immune cells must first bind to endothelial cells (a monolayer of cells that line the lumen of the blood vessels) before they can pass through the vessel walls.  Proteins called selectins are found on cells that can pass through vessels and on endothelial cells. There are 3 types:

  1. L-selectins: found on leukocytes ("white" blood cells that are circulating immune cells)

  2. P-selectins: found on activated platelets (which can aggregate to form a type of blood clot) and activated endothelial cells.  Activation occurs during the inflammatory response which can lead to the quick movement of pre-formed selectins stored within the cytoplasm to the membrane.  In addition, their expression can be induced.

  3. E-selectins: found on activated endothelial cells only after the cells have been induced to form them by certain immune hormones called cytokines, which are released by immune cells during an inflammatory response.

These selectins are transmembrane proteins with an extracellular CHO binding domain, an EGF-like (epidermal growth factor like) domain, varying numbers of C (complement regulatory) domains, and a transmembrane domain.  The extracellular CHO binding domain is found in proteins in all organisms.  Proteins that bind carbohydrate motifs are called lectins.

Lectins and CHO ligands

Lectin Family/Lectin              | Abbreviation | Ligand(s)

Concanavalin A                    | ConA         | Manα1-OCH3
Griffonia simplicifolia lectin 4  | GS4          | Lewis b (Leb) tetrasaccharide
Wheat germ agglutinin             | WGA          | Neu5Ac(α2->3)Gal(β1->4)Glc; GlcNAc(β1->4)GlcNAc
Ricin                             |              | Gal(β1->4)Glc

Galectin-1                        |              | Gal(β1->4)Glc
Mannose-binding protein           | MBP-A        | High mannose octasaccharide
Influenza virus hemagglutinin     | HA           | Neu5Ac(α2->6)Gal(β1->4)Glc
Polyoma virus protein 1           | VP1          | Neu5Ac(α2->3)Gal(β1->4)Glc

Enterotoxin                       | LT           | Gal
Cholera toxin                     | CT           | GM1 pentasaccharide

In animals, lectins facilitate cell-cell interactions by forming multiple, but weak interactions between the protein and many sugars on the ligand to which it binds. 

The selectins are also part of a class of molecules called adhesion molecules.  As mentioned for the selectins, adhesion molecules contain

The selectins recognize Ser-linked CHO residues (a tetrasaccharide containing sialic acid, galactose, GalNAc and fucose) displayed on transmembrane glycoproteins called selectin ligands. L-selectins bind to endothelial cell ligands, while P- and E-selectins bind to ligands on leukocytes.  These interactions, which involve protein-CHO binding, slow the leukocyte down as it rolls along the surface of the endothelial cells.

This initial binding, mediated by selectin-CHO interactions, activates the expression of another adhesion molecule on the leukocyte, integrin, a heterodimer with an α and a β chain.  Integrins cause strong leukocyte-endothelial cell interactions, leading ultimately to movement of the leukocytes through the vessel wall.   Other classes of adhesion molecules (in addition to selectins and integrins) are cadherins (calcium-dependent adhesion molecules) and the immunoglobulin-like superfamily (ICAM1, ICAM2, VCAM).  VCAM (Vascular Cell Adhesion Molecule) binds the integrin expressed on activated lymphocytes, leading to passage of the lymphocyte from the lumen of the vessel into the tissues.  Integrins appear to bind proteins in the extracellular matrix through RGD (Arg-Gly-Asp) and also through LDV (Leu-Asp-Val) motifs on the proteins, including fibronectin (RGD), thrombospondin (RGD & LDV), fibrinogen (RGD & LDV), von Willebrand Factor (RGD), and vitronectin (RGD).  They also bind other matrix proteins with an "alpha domain", including collagen and laminin.    Integrin/adhesion molecule interactions involve protein/protein interactions.

Genbacev et al. have shown that a fertilized egg (in the blastocyst stage, which is ready for implantation in the uterine wall) expresses L-selectin, which allows a low affinity (rolling-type) interaction of the fertilized egg with the uterine epithelial cells.  These cells express CHO ligands on their surface which bind to the L-selectin on the blastocyst.  The CHO ligands are only transiently expressed on the surface of the epithelial cells of the uterus, presumably only when the uterus is primed for implantation.  After the initial interaction of the blastocyst and epithelial cells, further expression of integrins on the blastocyst surface might result.  Problems in any of these molecular steps could result in infertility.

Figure:  Endothelial Cell/Leukocyte Interactions:  Selectins, Integrins, and ICAMs

An interesting experiment by Davis et al. showed the importance of protein modification (like glycosylation) to binding and biological function.  Post-translational modifications represent one of nature's ways to change protein function.  The researchers were able to chemically modify surface features of a protein to produce new functionalities.    They did so by using mutagenesis to change surface amino acids to Cys, or by replacing Met residues with nonnatural amino acid analogs that contain azide or alkyne groups.  These modified groups could then direct chemical modifying reagents (such as sugars) to these sites.   The researchers studied a pair of proteins involved in inflammation: P-selectin and the transmembrane protein it binds, P-selectin-glycoprotein ligand-1 (PSGL-1), which requires two post-translational changes to bind to P-selectin.  They picked a protein completely unrelated to PSGL-1 and selectively modified it using this approach so that it contained a glycosylated and a sulfated side chain.  The unrelated protein bound to P-selectin.

Selectins:  L-selectin  |  P-selectin  |  E-selectin  |  Selectin Ligands

Integrins at a glance

Inner Life of Cell:  from Harvard (wait few moment to load) with narration

Integrins: Great Source of Information!

  Jmol:  P-Selectin Lectin/EGF Domains (1G1Q)

Receptor for Sialic Acids

Lectins that recognize sialic acids, especially members of the Siglec family (sialic acid-recognizing Ig-superfamily lectins), turn out to be important players in our propensity for disease.  As we previously discussed, humans lack a hydroxylase gene necessary for the hydroxylation of Neu5Ac to Neu5Gc, which is found in chimps, who possess the enzyme.  Chimps' immune systems seem to confer protection from acquiring simian versions of AIDS, cirrhosis, and other diseases which humans acquire when they are infected with HIV, hepatitis B or C, or other viruses.  These diseases and others associated with overactive T cells (rheumatoid arthritis, asthma, type-I diabetes) are not common in chimps.    It turns out that there is a link between the type of sialic acid and the expression of Siglecs that influences the difference in our disease propensity.  Varki et al. have shown that chimps and gorillas show much higher levels of expression of Siglecs on T cells, which are critical regulatory and effector cells in the immune system. When Siglecs on T cells are activated, T-cell responses are down regulated.  Although HIV ultimately kills T helper cells, the virus initially activates them on infection, leading to their proliferation and production of a larger number of cells for the virus to infect.

Influenza virus, which has caused some of the greatest pandemics in world history, also binds to sialic acid on host cells through a viral binding protein called hemagglutinin.  On binding, conformational changes activate a neuraminidase activity of another viral protein, allowing cleavage of the sialic acid glycosidic bond and subsequent entry of the virus into the cell.

B8.  General Links and References

CHO Web Sites

Sweet:   a program for constructing 3D models (unfortunately using Java) of saccharides from their sequences using standard nomenclature.

Textbook mistakes in Biochemistry:  Glycogen


  1. Sander I. van Kasteren, Holger B. Kramer, Henrik H. Jensen, Sandra J. Campbell, Joanna Kirkpatrick, Neil J. Oldham, Daniel C. Anthony, Benjamin G. Davis. Expanding the diversity of chemical protein modification allows post-translational mimicry.  Nature. 446, pg 1105 (2007)
  2. Dzung H. Nguyen, Nancy Hurtado-Ziola, Pascal Gagneux, and Ajit Varki. Loss of Siglec expression on T lymphocytes during human evolution.  PNAS. 103 (20), pg 7765-7770 (2006)
  3. Cohen, J. Differences in Immune Cell "Brakes" May Explain Chimp-Human Split on AIDS.  Science. 312, pg 672 (2006)
  4. Borman, Stu. Carbohydrate Advances.  C&E News. August 6, pg 41 (2005)
  5. Genbacev, O., et al. Trophoblast L-Selectin-Mediated Adhesion at the Maternal-Fetal Interface.  Science. 299, pg 405 (2003) (Review:  Vol 299, pg 355, 2003)
  6. Schofield et al. Synthetic GPI as a candidate anti-toxic vaccine in a model of malaria.  Nature. 418, pg 785 (2002)
  7. Samuelson et al. Making Membranes in Bacteria (moving proteins to the correct location).  Nature. 406, pg 575, 637 (2000)
  8. Seeberger et al. Sugars Join the Automation Rush (solid phase synthesis of oligosaccharides).  Science. 291, pg 805 (2001)
  9. Peters et al. Fusion needs more than SNAREs (how membranes fuse).  Nature. 409, pg 567 (2001)
  10. Saxon and Bertozzi. Cell Surface Engineering by a modified Staudinger reaction.  Science. 287, pg 2007 (2000)
  11. Humphries et al., Forsberg et al. Mast cell Heparin (role).  Nature. 400, pg 714, 769, 773 (1999)
  12. Meléndez-Hevia et al. Glycogen Structure: an Evolutionary View.  In Technological and Medical Implications of Metabolic Control Analysis (ed. A. Cornish-Bowden and M. L. Cárdenas), Kluwer Academic Publishers, Dordrecht, pp. 319-326 (2000)
  13. Helenius et al. Translocation of lipid-linked oligosaccharides across the ER membrane requires Rft1 protein.  Nature. 415, pg 382, 447 (2002)



Creative Commons License
Biochemistry Online by Henry Jakubowski is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.