Protein Structure and Function - An Overview
Pharmaceutical Biochemistry I
Instructor: Patrick M. Woster, Ph.D.
Reading Assignment: Berg, Chapters 2, 3 and 7
Protein Structure and Function
Proteins play crucial roles in almost every biological process. They are responsible in one form or another for a
variety of physiological functions including:
Enzymatic catalysis - almost all biological reactions are enzyme catalyzed. Enzymes commonly
increase the rate of a biological reaction by a factor of at least 10 to the 6th power! Several thousand
enzymes have been identified to date.
Binding, transport and storage - small molecules are often carried by proteins in the physiological
setting (for example, the protein hemoglobin is responsible for the transport of oxygen to tissues). Many
drug molecules are partially bound to serum albumins in the plasma.
Molecular switching - conformational changes in response to pH or ligand binding can be used to
control cellular processes.
Coordinated motion - muscle is mostly protein, and muscle contraction is mediated by the sliding
motion of two protein filaments, actin and myosin.
Structural support - skin and bone are strengthened by the protein collagen.
Immune protection - antibodies are protein structures that are responsible for reacting with specific
foreign substances in the body.
Generation and transmission of nerve impulses - some amino acids act as neurotransmitters, which
transmit electrical signals from one nerve cell to another. In addition, receptors for neurotransmitters,
drugs, etc. are protein in nature. An example of this is the acetylcholine receptor, which is a protein
structure that is embedded in postsynaptic neurons.
Control of growth and differentiation - proteins can be critical to the control of growth, cell
differentiation and expression of DNA. For example, repressor proteins may bind to specific segments
of DNA, preventing expression and thus the formation of the product of that DNA segment. Also, many
hormones and growth factors that regulate cell function, such as insulin or thyroid stimulating
hormone are proteins.
Like most biological macromolecules, proteins are made up of simple building blocks; in the case of proteins,
these building blocks are called amino acids. As shown below, the amino and carboxyl moieties in an amino
acid are alpha to one another; also located on the alpha carbon is an "R" group. The nature of this R-group
(called the side chain) determines the identity of a particular amino acid. There are a total of 20 amino acids
which are used to make up proteins (some modified or otherwise unusual amino acids exist that we will discuss
later in the course). In solution at physiological pH (7.4), amino acids undergo an acid-base reaction to form
zwitterions. In a zwitterion, the + and - charges cancel to give a molecule with a net charge of zero. The
pKa values for a typical amino acid (glycine, for example) are 9.6 and 2.3 for the amino and carboxyl
groups, respectively. If the pH of an amino acid solution is lowered well below the carboxyl pKa, a species results in
which the amine group has a positive charge, while the carboxyl is neutral. Likewise, if the pH is raised
well above the amino pKa, a species results in which the amine group is neutral, while the carboxyl has a negative charge.
Thus, the ionization state of amino acids is pH dependent.
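The pH dependence described above can be made concrete with the Henderson-Hasselbalch relation. The following is a minimal sketch, not part of the original notes: it uses the glycine pKa values quoted in the text (2.3 and 9.6), and the function names are chosen here purely for illustration.

```python
# Average net charge of glycine as a function of pH.
# The pKa values are the ones quoted in the text; the function names are illustrative.
PKA_CARBOXYL = 2.3
PKA_AMINO = 9.6


def fraction_protonated(ph, pka):
    """Fraction of an ionizable group that still carries its proton at a given pH."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


def glycine_net_charge(ph):
    """+1 for each protonated amine, -1 for each deprotonated carboxyl (population average)."""
    amine_positive = fraction_protonated(ph, PKA_AMINO)
    carboxyl_negative = 1.0 - fraction_protonated(ph, PKA_CARBOXYL)
    return amine_positive - carboxyl_negative


for ph in (1.0, 7.4, 12.0):
    print(f"pH {ph:4.1f}: average net charge {glycine_net_charge(ph):+.2f}")
```

At low pH the result approaches +1 (protonated amine, neutral carboxyl), near pH 7.4 it is close to zero (the zwitterion), and at high pH it approaches -1.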
All amino acids except glycine (R = H) are chiral. Every amino acid in mammalian proteins exists in the L-configuration, where "L" signifies that the amino acid in Fischer projection is similar to L-glyceraldehyde.
This description of stereochemistry is outdated, and is seldom used except in trivial names. Under the
Cahn-Ingold-Prelog priority rules, nearly all of the natural L-amino acids are in the S-configuration; the exception is cysteine, which is R because its sulfur-containing sidechain outranks the carboxyl group.
As was mentioned above, there are 20 amino acids which are used to make up proteins in mammalian
biological systems. The amino acids are amphipathic molecules, meaning that they contain both polar and nonpolar functional groups, and thus have a tendency to form interfaces between hydrophilic and hydrophobic
molecules. The properties of each amino acid are dictated by the side chain, which can vary in size, shape,
charge, reactivity and ability to hydrogen bond. The amino acids are grouped according to the properties of
their sidechains, as shown in the figure below. Each amino acid has a standard three letter abbreviation
which is used in lieu of a full structure, as seen in the figure.
The first six amino acids, glycine (GLY), alanine (ALA), leucine (LEU), isoleucine (ILE), proline (PRO)
and valine (VAL) are aliphatic in nature. Glycine and alanine are too small to have a hydrophobic effect in
proteins, but they are considered aliphatic amino acids. Proline is also aliphatic, and because of its cyclic
structure, it can often be found in the bend portion of a protein chain. Valine, leucine and isoleucine are
hydrophobic aliphatic, and although they can be found anywhere in the chain, they prefer to cluster in the
inside region of a protein, away from water. This effect causes a significant stabilization of the protein structure.
There are three aromatic amino acids, phenylalanine (PHE), tyrosine (TYR) and tryptophan (TRP). These
amino acids have sidechains which contain delocalized pi electrons that can interact with other pi systems in
biomolecules. In addition, the phenolic hydroxyl of TYR can ionize under physiological conditions, and thus
increase water solubility. Two of the amino acids are sulfur-containing, namely cysteine (CYS) and
methionine (MET). These amino acids have special properties that will be covered at a later time. Finally, there
are two hydroxyl-containing amino acids, serine (SER) and threonine (THR). These two amino acids have
sidechains which can hydrogen bond to water or to other groups on neighboring macromolecules.
Five of the 20 amino acids are considered hydrophilic, in that they are able to ionize at physiological pH. The
amino acids lysine (LYS), arginine (ARG) and histidine (HIS) are considered basic hydrophilic, since they
contain basic sidechain groups that will have a positive charge at pH 7.4. The amino acids aspartic acid (ASP)
and glutamic acid (GLU) are considered acidic hydrophilic, since they contain acidic sidechain groups that
will have a negative charge at pH 7.4. These two amino acids also have amide counterparts, asparagine
(ASN) and glutamine (GLN).
Note that 8 of the 20 amino acids have ionizable sidechains. Arginine, lysine and histidine can have a positive
charge, while aspartic acid and glutamic acid can possess a negative charge under physiological conditions. It
is also possible for serine, tyrosine and cysteine to ionize to a negatively charged species during certain
Protein chains are held together by peptide bonds, which are simply amide linkages between neighboring
amino acids. When two amino acids interact, an equilibrium is set up between unbound amino acids and a
species in which two amino acids are linked, called a dipeptide. Since this equilibrium favors the unlinked
forms of the amino acids, it is clear that formation of a peptide bond requires energy. When a few amino acids
become linked, the protein species is called an oligopeptide, and when many are linked, the species is called a
polypeptide. Polypeptides are generally between 50 and 2000 amino acids long. Their molecular weights are
expressed in Daltons, where 1 Dalton is equal to 1 atomic mass unit (approximately the mass of one hydrogen atom). 1000
Daltons is called a kilodalton (kD). Most proteins weigh between 5500 and 220,000 Daltons.
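The mass range quoted above follows from the chain lengths if one assumes an average residue mass. The sketch below is an illustration only: the value of 110 Daltons per residue is a commonly used rule of thumb and is an assumption here, not a figure stated in these notes.

```python
# Rough molecular weight estimate from chain length.
# ASSUMPTION: an average residue mass of 110 Da (a common rule of thumb, not from the notes).
AVERAGE_RESIDUE_MASS_DA = 110


def estimate_mass_da(n_residues):
    """Estimated molecular weight in Daltons for a chain of n_residues amino acids."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA


for n in (50, 2000):
    mass = estimate_mass_da(n)
    print(f"{n} residues: about {mass} Da ({mass / 1000:.1f} kD)")
```

With this assumption, chains of 50 and 2000 residues come out at 5500 and 220,000 Daltons, the same limits given in the text.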
Each peptide chain has two free ends, the amino terminus, which is always drawn on the left by convention,
and the carboxyl terminus, which is always drawn on the right. This convention extends to peptide chains
expressed using three letter abbreviations. Thus, the oligopeptide ALA-GLY-TRP-SER-GLU has an alanine at
the amino terminus, and a glutamic acid at the carboxyl terminus.
Amino acids in a protein are determined by the genetic code, wherein a three base sequence of nucleotides
called a codon calls for a specific amino acid to be added to the growing chain. The process of converting the
sequence of codons into a sequence of amino acids entails transcription (the conversion of a segment of DNA
into complementary mRNA) and translation (the conversion of the mRNA code into protein). You will learn a
great deal more about protein synthesis later in the semester.
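As a small illustration of how codons specify a sequence, the sketch below translates a short mRNA into the oligopeptide used as an example earlier. It is a simplified sketch: the table holds only a handful of entries from the standard genetic code (the full code has 64 codons), and the mRNA string and function name are made up for this example.

```python
# Excerpt of the standard genetic code (mRNA codons). Only a few of the 64 codons are listed.
CODON_TABLE = {
    "AUG": "MET", "GCU": "ALA", "GGU": "GLY", "UGG": "TRP",
    "UCU": "SER", "GAA": "GLU",
    "UAA": None,  # None marks a stop codon
}


def translate(mrna):
    """Read the mRNA three bases at a time and return the peptide in three-letter codes.

    Raises KeyError for codons missing from the excerpted table above.
    """
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue is None:  # stop codon: the chain is finished
            break
        peptide.append(residue)
    return "-".join(peptide)


# Hypothetical mRNA: a start codon (MET), five more codons, then a stop codon.
print(translate("AUGGCUGGUUGGUCUGAAUAA"))
```

The peptide is written from the amino terminus to the carboxyl terminus, following the left-to-right convention described above.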
As shown below, amino acids can participate in reactions that occur after they are positioned in a peptide chain.
These reactions are called post-translational modifications, and can be of enormous biological significance.
One example of a post-translational modification is the crosslinking of two cysteines to form a new amino
acid, called cystine. This modification most often occurs in extracellular proteins, and can contribute to their
There are other post-translational modifications of biological significance, three of which are shown below. In
some proteins, acetylation of the amino terminus occurs. This modification greatly decreases protein
degradation, since many proteases require a free amino terminus to act. In structural proteins such as collagen,
hydroxylation of proline occurs to afford hydroxyproline (HPRO). Since hydroxyproline has a hydrogen-bonding sidechain, it lends additional strength to the collagen structure, and hence to tendons and other
like tissues. Finally, the amino acids serine, threonine and tyrosine can be phosphorylated within a protein
chain. This modification is often used by the cell to turn on or off a critical biological process.
In addition to the post-translational modifications mentioned above, some proteins are synthesized in inactive
forms called pro forms. For example, some enzymes are synthesized as inactive proenzymes, and are trimmed
by a peptidase to form the active enzyme. The portion of the enzyme chain that is cleaved is then hydrolyzed,
and the amino acids are reused.
A bit of history:
In 1953, Sanger performed a critical series of experiments in which he demonstrated several facets of protein
structure. His experiments showed that proteins have a unique amino acid sequence; all molecules of a given
protein are identical, and the sequence of each different protein is unique. He also showed for the first time that
all amino acids in mammalian proteins are in the L-configuration, that the peptide bond is an amide bond,
and that amino acids have alpha amino groups and alpha carboxyl groups. We now know that proteins are
made when a section of DNA is read (a process called transcription) and a complementary molecule
of RNA is formed. This RNA is then used to specify the structure of a given protein through a process called
translation. Thus, the sequence of a protein is encoded in DNA.
The sequence of a peptide is important for other reasons including these:
Knowledge of a peptide sequence can aid in the determination of the mechanism of action of the
protein. For example, binding areas of a protein often contain hydrogen-bonding amino acids such
Relationships between amino acids in a protein chain can help to dictate 3 dimensional structure. For
example, when two cysteines crosslink to form a cystine (as described above) a loop is formed in the
Variations in the amino acid sequence of certain proteins can cause disease. For example,
substitution of a VAL for a GLU at a single residue of the hemoglobin beta chain results in a mutant hemoglobin
called hemoglobin S. This defect, caused by a genetic mutation, results in the disease sickle cell anemia,
since deoxygenated hemoglobin S aggregates into fibers that distort red blood cells and impair oxygen delivery. Since hemoglobin has 574
amino acids and a molecular weight of 63 kD, one can conclude that very small variations in structure
can have a great effect on biological activity!
The peptide bond has unique characteristics which contribute to the overall structure of proteins. The peptide
bond itself is rigid, and thus is not free to rotate. This rigidity arises because resonance between the carbonyl
and the amide nitrogen gives the bond considerable double bond character. The other bonds in the backbone are not rigid, and
can freely rotate, giving the protein chain many degrees of rotational freedom. The amide bond, together with
the bonds on either side of it that connect to the alpha carbons, makes up the backbone of the protein chain.
Proteins have a total of four levels of structure, as defined below:
Primary structure - this term refers to the amino acid sequence of a protein, including cystines that
are formed during crosslinking. Sequence can dictate three dimensional structure, since amino acid
residues need to be in a specific order to foster proper protein folding, and since disulfides must be
formed from properly positioned cysteines to afford an active protein. An example of primary structure
is the hypertensive octapeptide angiotensin II, which has the sequence ASP-ARG-VAL-TYR-ILE-HIS-PRO-PHE.
Secondary structure - this term refers to the arrangement of amino acids that are close together in a
chain. Examples of secondary structures are helices and pleated sheets. An alpha helix is a tightly
coiled, rodlike structure which has an average of 3.6 amino acids per turn. The helix is stabilized by
hydrogen bonding between the backbone carbonyl of one amino acid and the backbone NH of the amino
acid four residues away. Within the helix, all main chain NH and carbonyl groups are hydrogen bonded, and the R
groups stick out from the structure in a spiral arrangement. As seen in the table below, there are several
types of alpha helix that arise from the degree of hydrogen bonding in the helix.
Another type of secondary structure, the beta pleated sheet, is composed of two or more straight chains that are
hydrogen bonded side by side. If the amino termini are on the same end of each chain, the sheet is termed
parallel, and if the chains run in the opposite direction (amino termini on opposite ends), the sheet is termed
antiparallel (see below left). All of the amides are hydrogen bonded except those on the outer strands. Pleated
sheets may be formed from a single chain if it contains a beta turn, which forms a hairpin loop structure. Often
a proline can be found in a beta turn, since it places a "kink" in the chain. When the beta sheet curves around
itself and the outer edges on either side hydrogen bond to one another, it forms a structure called a beta barrel,
which is a common structural motif in proteins.
Tertiary structure - tertiary structure refers to the arrangement of amino acids that are far apart in the
chain. Each protein ultimately folds into a three dimensional shape with a distinct inside and outside.
The interior of a protein molecule contains a preponderance of hydrophobic amino acids, which tend to
cluster and exclude water. The core is stabilized by van der Waals forces and hydrophobic interactions. By
contrast, the exterior of a protein molecule is largely composed of hydrophilic amino acids, which are
charged or able to H-bond with water. This allows a protein to have greater water solubility. A protein
will spontaneously fold to preserve the relationships outlined above.
Quaternary structure - protein chains can associate with other chains to form dimers, trimers, and
other higher orders of oligomers. Generally they contain between 2 and 6 subunits which may be
chains with the same sequence (homodimers) or different chains (heterodimers).
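The alpha helix figures quoted in the secondary structure entry above (3.6 residues per turn, and a hydrogen bond from the backbone carbonyl of one residue to the backbone NH four residues along) lend themselves to simple arithmetic. The following is a minimal sketch with illustrative function names; it ignores end effects beyond noting that the last few carbonyls have no partner within the helix.

```python
# Geometry bookkeeping for an ideal alpha helix, using the 3.6 residues/turn figure from the text.
RESIDUES_PER_TURN = 3.6


def rotation_per_residue_deg():
    """Angle (in degrees) by which each successive residue is rotated about the helix axis."""
    return 360.0 / RESIDUES_PER_TURN


def helix_hbond_pairs(n_residues):
    """(i, i + 4) pairs: carbonyl of residue i bonded to the NH of residue i + 4 (1-based)."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(f"rotation per residue: {rotation_per_residue_deg():.0f} degrees")
for carbonyl, amide in helix_hbond_pairs(10):
    print(f"C=O of residue {carbonyl} ... H-N of residue {amide}")
```

The 100 degree step per residue is what produces the spiral arrangement of R groups around the outside of the helix.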
Proteins can be associated with membranes, and in fact carry out almost every membrane function.
Interestingly, membrane proteins have special characteristics that allow them to exist in this lipid environment.
Proteins that sit on the inner or outer surface of the membrane are called extrinsic or peripheral, and have a
large percentage of hydrophobic amino acids in the portion of the molecule that is close to the hydrophobic
membrane structure. The amino acids on the outer portion of the protein (facing the aqueous environment of the
cytoplasm or extracellular fluid) are mostly hydrophilic, allowing the protein to be compatible with water.
Proteins can also traverse the membrane, and in this case they are called intrinsic or integral. The portion of
the protein that passes through the membrane is composed of hydrophobic amino acid residues, while the inner
and outer portions exposed to water are largely hydrophilic. Transmembrane proteins can move laterally in the
membrane, but cannot flip-flop.
Proteins are a unique class of biomolecules, in that they can recognize and interact with diverse substances. They
contain complementary clefts and surfaces which are shaped to bind to specific molecules. Often only a
single molecule or even a single stereoisomer can bind to a complementary protein surface. Once this binding
takes place, a complex is formed. This induces a conformational change which may act as a signal within the
cell, or may serve to activate an enzyme.
Methods for Protein Isolation and Purification
There are a number of experimental procedures which may be used to characterize peptides and larger protein
molecules. Six of these methods are discussed below:
1. Enzymatic cleavage - A peptide chain may be cleaved at specific peptide bonds using enzymes
known as peptidases. One of the most common peptidases is trypsin, which cleaves a peptide chain on
the carboxyl side of a LYS or ARG. Thus the sequence PRO-HIS-ARG-GLY-GLY is cleaved to PRO-HIS-ARG and GLY-GLY. Another common peptidase is chymotrypsin, which cleaves the chain on the
carboxyl side of each aromatic amino acid (TRP, TYR, PHE). Thus the sequence GLN-SER-PHE-ASP-GLY-TYR-THR is cleaved to GLN-SER-PHE, ASP-GLY-TYR and THR.
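The two cleavage rules above can be expressed as a short routine. This is a simplified sketch under the rules exactly as stated in this section; real enzymes show sequence-context exceptions that are not modeled, and the function and constant names are illustrative.

```python
# Simplified protease digestion: cut on the carboxyl side of every listed residue.
TRYPSIN = {"LYS", "ARG"}
CHYMOTRYPSIN = {"TRP", "TYR", "PHE"}


def digest(peptide, cleave_after):
    """Split a peptide written as three-letter codes joined by '-' into fragments."""
    fragments, current = [], []
    for residue in peptide.split("-"):
        current.append(residue)
        if residue in cleave_after:  # the bond after this residue is cut
            fragments.append("-".join(current))
            current = []
    if current:  # whatever is left at the carboxyl terminus
        fragments.append("-".join(current))
    return fragments


print(digest("PRO-HIS-ARG-GLY-GLY", TRYPSIN))
print(digest("GLN-SER-PHE-ASP-GLY-TYR-THR", CHYMOTRYPSIN))
```

Both calls reproduce the fragments listed in the text for the two example sequences.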
2. Electrophoresis. Electrophoresis refers to the separation of proteins by causing them to move in
an electric field. This is usually done on a gel made of polyacrylamide. A current is passed through the
gel, and the proteins migrate from the cathode to the anode. In sodium dodecyl sulfate (SDS)
electrophoresis, proteins are treated with the detergent SDS and mercaptoethanol to denature them
and disrupt disulfide bonds, and are then loaded onto the gel. When the electric field is applied,
smaller peptides migrate fastest, as shown in the diagram below:
A second common electrophoresis procedure is known as isoelectric focusing, because proteins
migrate until they reach electroneutrality. Consider a protein that has 50 ionizable sidechains, 25 that can
be positive and 25 that can be negative. The isoelectric point (pI) is the pH at which the numbers of
positive and negative charges are equal, so that the net charge is zero. In isoelectric focusing, a
polyacrylamide gel is treated with ampholines, which set up a pH gradient across the length of the gel.
As shown above, each protein will "focus" at the point on the gel where the pH equals its isoelectric
point, at which time it stops moving. Since isoelectric focusing is non-denaturing, it can be used to
isolate active proteins in their native form.
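The idea of an isoelectric point can be illustrated numerically on the simplest possible case. The sketch below is an assumption-laden example, not a method from these notes: it treats glycine, with the two pKa values quoted earlier (2.3 and 9.6), as a stand-in for a protein and searches for the pH at which the average net charge is zero.

```python
# Locate the isoelectric point of glycine by bisection on the net charge.
# pKa values are the glycine values quoted earlier in these notes.
PKA_CARBOXYL = 2.3
PKA_AMINO = 9.6


def net_charge(ph):
    """Average net charge: protonated amines (+) minus deprotonated carboxyls (-)."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_AMINO))
    negative = 1.0 / (1.0 + 10 ** (PKA_CARBOXYL - ph))
    return positive - negative


def isoelectric_point(low=0.0, high=14.0, tolerance=1e-6):
    """Net charge falls steadily as pH rises, so bisection finds the zero crossing."""
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(mid) > 0:
            low = mid   # still positive: the pI lies at higher pH
        else:
            high = mid  # zero or negative: the pI lies at lower pH
    return (low + high) / 2.0


print(f"estimated pI of glycine: {isoelectric_point():.2f}")
```

For a molecule with only these two groups the answer is the average of the two pKa values, about 5.95. A real protein would need one term per ionizable sidechain, but the search works the same way.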
3. The Edman Degradation. The Edman degradation refers to a reaction that is used to determine
the sequence of a given peptide. The amino terminus of the peptide is treated with phenyl
isothiocyanate, forming a complex, as shown below. Upon acid treatment, the terminal amino acid is
removed by cleavage of the first peptide bond, forming a phenylthiohydantoin. Note that the R group
of the phenylthiohydantoin is the same as the R group of the terminal amino acid. Thus, there are 20
phenylthiohydantoins that can form during the Edman degradation, one for each of the 20 amino acids.
Repeated cycling allows for each amino acid in the chain to be identified by isolating its
phenylthiohydantoin. This procedure is carried out rapidly and efficiently by an automated sequencer.
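The repeated cycling can be pictured as peeling one residue at a time from the amino terminus. The following is a bookkeeping sketch only; it models the order in which residues are identified, not the chemistry, and the function name is illustrative.

```python
# Bookkeeping model of repeated Edman cycles: each cycle releases the current
# amino-terminal residue (as its phenylthiohydantoin) and leaves a shorter peptide.
def edman_cycles(peptide):
    """Return a list of (cycle number, residue identified, peptide remaining)."""
    remaining = peptide.split("-")
    cycles = []
    while remaining:
        released = remaining.pop(0)  # the amino-terminal residue
        cycles.append((len(cycles) + 1, released, "-".join(remaining)))
    return cycles


for cycle, residue, rest in edman_cycles("ALA-GLY-TRP-SER-GLU"):
    print(f"cycle {cycle}: identified {residue}, remaining peptide: {rest or '(none)'}")
```

Reading the identified residues in cycle order gives the sequence from the amino terminus to the carboxyl terminus.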
Peptides can also be synthesized by an automated process. These peptides are constructed on beads
made of polystyrene or some other solid support in a process known as solid phase synthesis. As
shown below the bead is reacted with the carboxyl end of an amino acid in which a protecting group
such as N-Boc is in place to keep the amine from reacting prematurely. Once the amino acid is attached
to the bead, the amino terminus is deprotected by treating with acid, and a peptide bond is formed with a second
protected amino acid. The coupling of these two amino acids is done in the presence of a coupling reagent, which fosters
the formation of the amide.
The cycle of removal of the protecting group and addition of amino acids is continued until the desired
peptide has been formed, and then the peptide is released from the bead using HF.
4. Ion Exchange Chromatography. Ion exchange chromatography separates proteins based on their
charge, as shown below. There are two methods, known as anion exchange (shown below) and cation
exchange. In anion exchange chromatography, a protein is added to a column packed with beads which
bear a positively charged group such as diethylaminoethyl. The negative charges on the protein
displace the counterion (chloride is shown) and stick to the bead. After washing the column, the protein
is eluted using another negative ion. Sodium chloride in a concentration gradient is commonly used,
and the more negative charges on a protein, the better it sticks, and the more NaCl is needed to displace
it. A cation exchange column works the same, except that the charge on the bead is negative, and
proteins stick by their positively charged residues.
5. Affinity Chromatography. Affinity chromatography is used to isolate one particular protein from
a mixture, as shown in the figure below. An epoxysepharose gel is allowed to react with a ligand that
has an affinity for the protein of interest, and the protein mixture is then added to the column. Only
the protein that binds to the ligand will stick. After washing the column to remove the rest of the protein,
the protein of interest is eluted using a salt gradient.
6. Enzyme-Linked Immunosorbent Assay (ELISA). Enzyme-linked immunosorbent assay, or
ELISA, depends on the reaction of a predetermined protein with a specific antibody to form a complex.
This method is extremely sensitive, and can distinguish between two proteins that differ by only one
amino acid. A serum or blood sample is added to the specific antibody which has been bound to a
polymer support, and the first complex forms. A second antibody, specific for the protein of interest but
linked to an enzyme is then added, forming a complex that is bound to an active enzyme. The enzyme
carries out the conversion of a non-colored or non-fluorescent substrate to a colored or fluorescent
product, which is measured. The more color that is produced, the more of the protein of interest that is
present. This technique is the basis for many diagnostic tests, including pregnancy tests where human
chorionic gonadotropin is measured.
Specific Examples of Protein Structure and Function
1. The Renin-Angiotensin-Aldosterone System. The renin-angiotensin-aldosterone system is used by the
body to regulate blood pressure (see the figure below). In response to lowered blood pressure, the kidney
releases the protease renin, which cleaves the inactive, 14 amino acid peptide angiotensinogen to another
inactive peptide, the decapeptide angiotensin I. A second enzyme, angiotensin converting enzyme (ACE),
converts this decapeptide to its active form, the octapeptide angiotensin II. Angiotensin II is a potent
vasoconstrictor that is about 40 times more potent than norepinephrine at raising vascular pressure. In addition,
angiotensin II stimulates the release of aldosterone, a steroid hormone that causes the kidney to reabsorb
sodium and water, thus raising blood pressure by an osmotic effect. Angiotensin II is ultimately inactivated by a
third peptidase called angiotensinase.
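The conversion of the decapeptide to the octapeptide can be sketched as the removal of a dipeptide from the carboxyl terminus. Two points here are assumptions that go beyond what these notes state: the angiotensin I sequence used below is the commonly cited one, and ACE is treated as removing the two carboxyl-terminal residues. The function name is illustrative.

```python
# ASSUMPTION: commonly cited angiotensin I sequence (not given in these notes).
ANGIOTENSIN_I = "ASP-ARG-VAL-TYR-ILE-HIS-PRO-PHE-HIS-LEU"


def remove_c_terminal_dipeptide(peptide):
    """Return (shortened peptide, released dipeptide), cutting two residues from the carboxyl end."""
    residues = peptide.split("-")
    return "-".join(residues[:-2]), "-".join(residues[-2:])


product, released = remove_c_terminal_dipeptide(ANGIOTENSIN_I)
print("product: ", product)
print("released:", released)
```

Under these assumptions the product is the octapeptide ASP-ARG-VAL-TYR-ILE-HIS-PRO-PHE, which matches the angiotensin II sequence given earlier as an example of primary structure.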
The renin-angiotensin-aldosterone system is of great importance in the development of a common disease
known as essential hypertension. When the renin-angiotensin-aldosterone system is overactive, the basal blood
pressure is elevated, putting increased stress on the cardiovascular system. A group of compounds have been
developed known as ACE inhibitors which are used quite effectively to treat hypertension. Since they prevent
the conversion of angiotensin I to angiotensin II, they prevent the elevation of blood pressure seen in essential
2. Oxytocin and Vasopressin. Oxytocin and vasopressin are two peptide hormones with very similar
structure, but with very different biological activities. Their primary sequences are shown below. Interestingly,
their structures differ by only two amino acid residues (ILE number 3 and the hydrophobic LEU number 8 in oxytocin are replaced
by PHE and a hydrophilic ARG residue, respectively, in vasopressin). Oxytocin is a potent stimulator of uterine smooth muscle, and
also stimulates lactation. However, vasopressin, also known as antidiuretic hormone (ADH), has no effect on
uterine smooth muscle, but causes reabsorption of water by the kidney, thus increasing blood pressure.
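A position-by-position comparison makes the similarity between the two hormones easy to see. The sequences below are the commonly cited ones for human oxytocin and arginine vasopressin; they are written out here as an assumption, since the notes show them only in a figure, and the function name is illustrative.

```python
# ASSUMPTION: commonly cited sequences (both peptides also carry a carboxyl-terminal
# amide and a CYS 1 to CYS 6 disulfide, which are not represented in these strings).
OXYTOCIN = "CYS-TYR-ILE-GLN-ASN-CYS-PRO-LEU-GLY"
VASOPRESSIN = "CYS-TYR-PHE-GLN-ASN-CYS-PRO-ARG-GLY"


def sequence_differences(first, second):
    """Return (position, residue in first, residue in second) wherever two equal-length peptides differ."""
    a, b = first.split("-"), second.split("-")
    if len(a) != len(b):
        raise ValueError("peptides must have the same number of residues")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


for position, oxy, vaso in sequence_differences(OXYTOCIN, VASOPRESSIN):
    print(f"position {position}: {oxy} in oxytocin, {vaso} in vasopressin")
```

With these sequences the two nonapeptides differ at positions 3 and 8 only, yet their biological activities are quite different.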
3. Insulin and Glucagon. Insulin is an extremely important peptide hormone that is produced by the beta
cells of the islets of Langerhans in the pancreas. It has 51 amino acids, three disulfide crosslinks, and is
composed of two separate chains, termed A and B. Insulin has a number of important effects on cells in the
1. Stimulation of glycolysis (glucose breakdown).
2. Stimulation of glycogen formation (a storage form for glucose).
3. Enhancement of the rate of fatty acid biosynthesis.
4. Stimulation of the entry of glucose into cells.
5. Overall reduction of blood glucose levels.
Insulin is not synthesized in active form, but is first made as a single inactive peptide chain called
preproinsulin (see the figure below). Preproinsulin has no crosslinks, and in addition to the A and B chain, has
two additional portions called the signal sequence and the connecting (C) peptide. The signal sequence
informs the cell that insulin is being made, and that the finished preproinsulin should be deposited outside the
cell. The C-peptide is necessary to allow preproinsulin to fold in the correct conformation to ultimately produce
active insulin. Preproinsulin is processed by a two step procedure; in the first step, the signal sequence is
cleaved by a peptidase, and two of the three crosslinks are formed to give a new but still inactive peptide called
proinsulin. A second peptidase then cleaves the C-peptide, and an internal disulfide forms to produce insulin.
Glucagon is a peptide hormone that is formed in the alpha cells of the Islets of Langerhans in the pancreas. It
is a single chain peptide consisting of 29 amino acid residues, and has effects which oppose insulin, including:
Down regulation of glycolysis.
Enhancement of the rate of glycogenolysis (glycogen breakdown).
Reduction in the rate of fatty acid synthesis.
Enhancement of blood glucose levels.
4. Hemoglobin. Hemoglobin A (HbA) is a tetrameric protein which consists of two alpha chains and two
beta chains, and comprises about 98% of adult human hemoglobin. There is a heme group and an oxygen binding site
on each subunit; therefore, each molecule of HbA can carry 4 molecules of oxygen. There are other forms of
adult human hemoglobin, the most common being HbA2, which has two alpha chains and two delta chains, and
accounts for about 2% of the total.
Hemoglobin is an example of an allosteric protein, i.e. its function can be altered by the binding of some
external substance (called the effector) at a site on the molecule other than the active site (the allosteric site).
When an allosteric effector binds to a protein, it induces a conformational change which turns the function of
the protein either on (positive allosterism) or off (negative allosterism). In the case of hemoglobin, the
allosteric effector is 2,3-diphosphoglycerate (2,3-DPG), which causes hemoglobin to have 1/26th of its normal
affinity for oxygen. This is an important issue, since 2,3-DPG in the tissues triggers the release of oxygen at the
Hemoglobin also exhibits cooperativity, which is a phenomenon wherein the binding of one molecule to a
protein with more than one active site influences the ease of binding of subsequent molecules. Cooperativity can
be positive (the second molecule binds more easily), or negative (the second molecule binds less easily). In the
case of hemoglobin, the binding of oxygen to the four sites of hemoglobin is an example of positive
As shown in the figure below, hemoglobin can also exist in a glycosylated form known as HbA1C. HbA1C is
formed when the amino terminus of HbA reacts with glucose, first reversibly forming an aldimine or Schiff
base, and then undergoing an irreversible Amadori rearrangement to afford the ketoamine form HbA1C. In
normal patients, HbA1C accounts for about 3-5% of HbA, but in diabetics who have elevated blood glucose for
extended periods, this number can reach 6 to 15%. Physicians can measure HbA1C, and are using it as a reliable
way to monitor how well diabetic patients are complying with their insulin therapy.
5. Collagen. Collagen is a connective tissue protein that is found in skin, bone, tendons, cartilage, the
cornea, etc. It is quite insoluble in water, and is composed of two types of chain termed alpha-1 and alpha-2.
In the amino acid sequence of collagen, about every 3rd amino acid is a GLY residue, and there are many
prolines which are hydroxylated to form hydroxyproline (HPRO). LYS residues are also hydroxylated in
collagen to form HLYS. These additional sidechain OH residues allow for extra strength due to H-bonding, and
the GLY residues allow the protein to coil more tightly, since they fit on the inside of the helix. In a collagen
fiber, three of these helices are coiled together to form a rope-like structure called a superhelical coil. It is this
structure that gives collagen its great strength. Collagen structure can be disrupted in diseases such as scurvy,
which results from a lack of ascorbic acid (vitamin C), a cofactor in the hydroxylation of proline. In addition, collagen structure is
disrupted in rheumatoid arthritis.
Different Levels of Protein Structure
The wide variety of 3-dimensional protein structures corresponds to the diversity of functions proteins fulfill.
Proteins fold in three dimensions. Protein structure is organized hierarchically from so-called primary structure
to quaternary structure. Higher-level structures are motifs and domains.
Above all, the wide variety of conformations is due to the huge number of different sequences of amino acid
residues. The primary structure is the sequence of residues in the polypeptide chain.
Secondary structure is a local, regularly occurring structure in proteins and is mainly formed through hydrogen
bonds between backbone atoms. So-called random coils, loops or turns don't have a stable secondary structure.
There are two types of stable secondary structures: alpha helices and beta sheets (see Figure 3 and Figure 4).
Alpha helices and beta sheets are preferably located at the core of the protein, whereas loops prefer to reside in
Figure 3: An alpha helix. The backbone forms a helix, and the side chains stick out. An ideal alpha helix has 3.6 residues per complete turn. There are hydrogen bonds between the backbone carbonyl group of amino acid n and the backbone amino group of amino acid n+4. The mean phi angle is -62 degrees and the mean psi angle is -41 degrees (see also the section on Helical Wheels).
Figure 4: An antiparallel beta sheet. Beta sheets are created when atoms of neighboring beta strands are hydrogen bonded. Beta sheets may consist of parallel strands, antiparallel strands, or a mixture of parallel and antiparallel strands.
Tertiary structure describes the packing of alpha-helices, beta-sheets and random coils with respect to each
other on the level of one whole polypeptide chain. Figure 5 shows the tertiary structure of Chain B of Protein
Kinase C Interacting Protein.
Figure 5: Chain B of Protein Kinase C Interacting Protein. Helices are visualized as ribbons and extended strands
of beta sheets as broad arrows. (The figure was obtained using RasMol and the PDB file corresponding to PDB-ID
1AV5, stored at PDB, the Brookhaven Protein Data Bank.)
Quaternary structure exists only if more than one polypeptide chain is present in a complex protein. In that
case, quaternary structure describes the spatial organization of the chains. Figure 6 shows both Chain A and
Chain B of Protein Kinase C Interacting Protein forming the quaternary structure.
Figure 6: Quaternary structure of Protein Kinase C Interacting Protein. (The figure was obtained using RasMol and
the PDB file corresponding to PDB-ID 1AV5, stored at PDB, the Brookhaven Protein Data Bank.)
Motifs and domains are combinations of secondary structures. Motifs consist of only a few secondary
structures. They may, but need not, have a function. A domain is more complex. It is usually defined as a
modular functional unit that folds independently.
Bacteriology at UW- Madison
Ken Todar's Microbial World
University of Wisconsin - Madison
Chemical and Molecular Composition of Microbes
From atoms to elements to molecules to macromolecules to life
Chemistry is essential to the study of living things. Not to sound irreverent, but much of life is based on
chemical reactions. This article will address a few principles of chemistry and biochemistry to prepare you for
these topics which will inevitably come up in The Microbial World.
All matter in the Universe is composed of elements. It is the elements that are identified and described in the
Periodic Table of the Elements (Table 1), familiar to all beginning chemistry students. Elements are made up
of atoms which consist of a variety of subatomic particles, the most important of which in biology are the
negatively-charged electron (e-) and the positively-charged proton (H+). Each element has distinct properties
due to the distinct nature of its atom and the behavior of electrons, protons and other subatomic particles in its
The atom is the fundamental unit of an element and cannot be broken down further without changing the
properties of the element. If an atom loses or gains one or more electrons, it will acquire an electrical charge.
Such atoms are referred to as ions. Thus, if a sodium (Na) atom were to lose an electron, it would acquire a
positive charge and be symbolized Na+. If a chlorine (Cl) atom were to gain an electron, it would acquire a
negative charge and be symbolized Cl-. Positively charged ions are called cations; negatively charged ions are
anions.
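The bookkeeping behind ion formation can be sketched in a few lines. The example below assumes the standard
proton counts for sodium (11) and chlorine (17), which are not stated in the text above; the function names are
hypothetical.

```python
# Net charge of an atom or ion: each proton contributes +1, each electron -1.
def net_charge(protons, electrons):
    """Return the net electrical charge in units of the elementary charge."""
    return protons - electrons


def classify(protons, electrons):
    """Label a species as a cation, an anion, or a neutral atom."""
    charge = net_charge(protons, electrons)
    if charge > 0:
        return "cation"
    if charge < 0:
        return "anion"
    return "neutral atom"


if __name__ == "__main__":
    # Assumed proton counts: Na has 11 protons, Cl has 17.
    print("Na  ->", classify(11, 11))   # neutral sodium atom
    print("Na+ ->", classify(11, 10))   # sodium after losing one electron
    print("Cl- ->", classify(17, 18))   # chlorine after gaining one electron
```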
Table 1. The Periodic Table of Elements. Given in the table are the distinct characteristics of the atom that
comprises each element: (1) the atomic number (in the upper right corner of the element symbol) is the number of
protons in the atom; (2) the atomic weight (below the symbol of the element) is derived from the combined weight
of the electrons, neutrons and protons that make up the atom. The atomic weight of an element must be known to
calculate the molecular weight of a chemical compound that is formed when elements bond together into molecules.
The major elements of living systems are C, H, O, N, S, and P. Minor elements are Na, Mg, K, Ca, Mn, Fe, Co, Ni,
Cu and Zn.
A cell, the fundamental unit of life on Earth, is composed of organic matter, the exact definition of which will
be given below. Organic material is made up of a relatively small handful of elements, cells being composed of
over 97% carbon (C), oxygen (O), nitrogen (N), hydrogen (H), phosphorus (P) and sulfur (S).
Table 2. The major elements of bacteria
When two or more elements interact with one another and achieve stability, a chemical compound is formed.
The smallest part of the compound that retains the chemical properties of the compound is termed a molecule.
The atoms in a molecule are joined to one another by some sort of chemical bond. Thus, two atoms of oxygen
(O) joined together form O2, or molecular oxygen; two atoms of nitrogen (N) joined together form N2 (nitrogen
gas); and carbon (C) bonded with two atoms of O forms CO2 (carbon dioxide). All three are gases in Earth's
atmosphere, with N2 and O2 the predominant ones. Two atoms of hydrogen (H) joined to an atom of oxygen form a
molecule of H2O, or water, which is the predominant liquid on the planet.
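As the Table 1 caption notes, atomic weights are needed to calculate the molecular weight of a compound. The
sketch below does this for the four molecules just named. The atomic weights are standard rounded values supplied
here as an assumption, and the dictionary layout and function name are chosen only for illustration.

```python
# Molecular weight as the sum of the atomic weights of the constituent atoms.
# Assumed standard atomic weights, rounded to three decimal places.
ATOMIC_WEIGHTS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

# Each molecule is described by how many atoms of each element it contains.
MOLECULES = {
    "O2": {"O": 2},
    "N2": {"N": 2},
    "CO2": {"C": 1, "O": 2},
    "H2O": {"H": 2, "O": 1},
}


def molecular_weight(composition):
    """Sum atomic weight times atom count over every element in the molecule."""
    return sum(ATOMIC_WEIGHTS[element] * count
               for element, count in composition.items())


if __name__ == "__main__":
    for name, composition in MOLECULES.items():
        print(f"{name}: {molecular_weight(composition):.3f}")
```

For water, this gives 2 x 1.008 + 15.999 = 18.015.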
Table 3. Major types of chemical bonds in biological molecules