A GUIDE TO MODELING PROTEINS AND NUCLEIC ACIDS USING Q/C

This document gives general instructions for using Quanta/Charmm96 (abbreviated Q/C or QC below), one of the most extensive programs for modeling biological macromolecules. Quanta is the trade name of the program that displays molecular structures. Charmm stands for Chemistry at HARvard Macromolecular Mechanics. It is a molecular mechanics program designed to calculate energies for large molecules, developed by Martin Karplus and coworkers at Harvard. You will use computer modeling:

to help you understand the secondary, supersecondary, tertiary, and quaternary structures of proteins that we discuss in class or that illustrate specific examples of common structural or functional motifs;
to help you understand the relationship of protein structure to biological function, including binding and catalysis;
to complete several problem sets related to peptide and protein structure; and
to model the protein you choose for your final modeling project and video.

The following instructions will allow you to get into Quanta/Charmm, pull up macromolecular structures, and manipulate them. Be patient. These instructions are not perfect. If you find them in error, or find an easier way to do things, let me know ASAP. The "less than profoundly useful" manuals for use of QC are located in the computer room. They are not to be removed from the room under severe penalty.

Getting into Q/C:

Click on APPLICATIONS and then Quanta4.1. Once you are logged into the program, a top menu bar appears. Click File at the far left to open a file within QC. Click OPEN a new file, then click DIRECTORY UP until you locate the directory titled hjakubow/. Click that and then the directory ch331proteins/. All the files on the chart are available to you. Follow the prompts until the structure appears.

If you wish to save the structure, or to use it in PROTEIN DESIGN under the Quanta APPLICATIONS menu at some future time, you must save it to your own directory. I strongly suggest that, if you wish to do extensive modeling with a given protein, you create a subdirectory for that protein and save all the files associated with it there. Otherwise, you will have great difficulty when you wish to delete irrelevant files at a later time. Instructions on how to create subdirectories and other simple UNIX commands are given in A GUIDE TO UNIX COMMANDS at the end of this document.

After you open the file, the structure should appear in the main window. Sometimes the structure appears slightly off screen or doesn't appear at all. If so, click Reset on the dials window (bottom right). This will solve the problem. The red dots which appear around the protein are water molecules.

With this program you can change the display of the molecule in many ways. You rotate and translate the image of the molecule by holding down the different mouse buttons. Hold down the middle mouse button and move the mouse around the viewing area. The structure rotates at the same speed and in the same direction as the mouse. If you press and hold the right button and move the mouse to the left, the structure rotates clockwise (move the mouse to the right for counterclockwise rotation). If you hold down the shift key and the middle mouse button, the object translates in the xy plane. The dials window allows control of molecular position as well.

When the protein appears, a new window called Molecule Management appears as well. It shows the number of atoms, the active atoms (the atoms which commands will affect), and the visible atoms (whether the atoms are displayed). If you click the Visible toggle, the molecule disappears from the screen, but its atoms are still active.

Rendering the Molecules

Many simple things can be done with the molecules using a program called Protein Design. Click on Applications on the top menu bar of QC, and then select Protein Design. Two new menus will now appear on the right hand side of the screen. They are labeled on the top bar as Protein Utilities and Protein Design. These menus appear in windows which overlap each other. You can move one of these windows to the front by clicking on the top bar of the window. Alternatively, you can drag these windows by clicking on one of the edges, holding down the mouse button, and moving the windows. It is easier to just click the top bars.

Click Molecular Colors under the Protein Utilities menu. Select Secondary Structure, alpha carbon trace to show the main chain with color coding. Click Legend to see the color coding scheme. Click Legend again to turn it off. Reselect Molecular Colors and experiment with different color coding, including coding for hydrophobicity. Click Torsion Tables on the Protein Utilities menu to see the phi, psi, and related torsion angles. Select Protein Health and then phi/psi plot to display Ramachandran plots. Click File on the Ramachandran plot top menu, then save the plot under a filename of your choice. Close Protein Design. Select File on the main QC top menu, then

     xy graphs
          file
               export
                    inverted postscript
                         name the file (put it in your home directory)

Printing instructions are given below.

Now go to the top menu bar of QC. Under the View menu, click Roll. The image will rotate continuously at a slow speed. Click Roll again to stop. To create a ribbon structure for the protein, click the Draw option on the top menu bar. Click Create Object, followed by Ribbon Structure, and then follow the prompts. You have now created a new object, the ribbon structure, which is overlaid on the protein structure. If you want to delete the ribbon structure, click on Draw, then Manage Object, and follow the prompts to delete it, or delete it directly from the object management window.

One of the best features of the program is the number of different ways the molecules can be rendered. The default is just lines representing the bonds. More sophisticated rendering can be done from the Draw item in the top menu of QC. Select Solid Models. You can then render the display as ball and stick, van der Waals spheres, liquorice bonds, protein cartoon, or user defined. A pop-up window appears, followed a few seconds later by the newly rendered image. These images can be rotated, but much more slowly. An even more beautiful display can be made by clicking Draw, followed by Ray Trace. The resulting space filling model cannot be rotated. Select Raster for another view.

Checking the Health of the Molecules

Use Protein Health from the Applications menu to determine whether imported PDB files are of high quality. Select Display Exceptions to show problem points in the structure. Select, one at a time, Buried Polar Atoms, Buried Hydrophilic, Exposed Hydrophobic Atoms, Holes, Close Contacts, and Chirality to monitor the quality of the structure. Select Main Chain Conformation to monitor the planarity of the peptide bond (the omega angle, the dihedral angle of the CA-C-N-CA system), which should be about 180 degrees. Select Sidechain to detect whether strange rotamer angles are present. You can select List Exceptions to Textport to view information on the exceptions.
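The planarity check described above can be reproduced by hand from four atomic positions. The following is only a minimal sketch in Python, not the method Quanta uses internally: the function names and the 15 degree tolerance are assumptions chosen for illustration, and the coordinates you pass in would come from your own structure file.

```python
import math


def dihedral_degrees(p1, p2, p3, p4):
    """Dihedral angle in degrees (range -180 to 180) defined by four points.

    For the omega angle the points are CA(i), C(i), N(i+1), CA(i+1).
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    b1 = sub(p2, p1)          # first bond vector
    b2 = sub(p3, p2)          # central bond vector (the rotation axis)
    b3 = sub(p4, p3)          # last bond vector
    n1 = cross(b1, b2)        # normal to the plane of points 1-2-3
    n2 = cross(b2, b3)        # normal to the plane of points 2-3-4
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def omega_is_trans_planar(omega, tolerance=15.0):
    """True if omega is within `tolerance` degrees of 180 (tolerance is an assumed cutoff)."""
    return abs(abs(omega) - 180.0) <= tolerance


if __name__ == "__main__":
    # Toy coordinates for a perfectly planar, trans arrangement of four atoms.
    omega = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
    print(abs(omega), omega_is_trans_planar(omega))
```

The same function gives phi and psi when it is fed the appropriate backbone atoms, which is how the points on a Ramachandran plot are obtained.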

Rendering the Active Site

Many times, you would like to display just the atoms within a certain distance of the active site (or binding site). You can do this in two ways. The first changes only the display of the molecule: select DRAW from the top menu bar, and then select Color followed by Selection Tools. Choose Proximity and follow the instructions. These commands alter the display of the molecule, but you cannot save this display for later recall.

A second method is to actually edit the molecule, permanently eliminating all the atoms outside the active site, and then save the result under a different file name. To do this, select Edit from the top menu bar, and then select Active Atoms, followed by Selection Tools. First click Exclude, All (all atoms turn blue), and then click Include. Next select Proximity Tools. Select All MSF's if that option is offered. Select the following in sequence: Graphical Sphere, Whole Residue, Set Radius Length to 12 (you might have to change this later to include all the structure you would like). An orange sphere will appear. All atoms in this sphere will be selected, so the sphere must be centered on the active site. You can center it in two ways. In the first, click Set Object Origin, and then use the mouse to click the atom on which you wish to center the sphere. Alternatively, use the dials palette and click Move in X, Y, or Z to move the sphere. Next select Apply Selection. All the atoms within the sphere should turn red. Then click Exit Proximity, followed by Finish. Only the atoms in the sphere will be displayed, and the Molecule Management window should show that the displayed structure has many fewer atoms than the starting pdb file. Now select Save As, and give the structure a new name to differentiate it from the starting file.

Note that under Proximity Tools, you can choose a graphical cylinder or slab instead of a sphere. Choose whichever appears most appropriate to your needs.
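The sphere selection with the Whole Residue option can be pictured as follows: a residue is kept, with all of its atoms, as soon as any one of its atoms lies inside the sphere. This is a minimal sketch of that idea in Python, not Quanta's own code. The function names and the toy coordinates are assumptions made for illustration.

```python
import math


def residues_within_sphere(atoms, center, radius):
    """Residue numbers that have at least one atom inside the sphere.

    `atoms` is a list of (residue_number, (x, y, z)) tuples.
    """
    selected = set()
    for residue_number, position in atoms:
        if math.dist(position, center) <= radius:
            selected.add(residue_number)
    return sorted(selected)


def atoms_in_residues(atoms, residue_numbers):
    """All atoms belonging to the chosen residues (the 'Whole Residue' behavior)."""
    keep = set(residue_numbers)
    return [atom for atom in atoms if atom[0] in keep]


if __name__ == "__main__":
    # Toy data: residue 1 has one atom near the center and one far away.
    toy_atoms = [
        (1, (0.0, 0.0, 0.0)),
        (1, (20.0, 0.0, 0.0)),
        (2, (30.0, 0.0, 0.0)),
        (3, (5.0, 5.0, 5.0)),
    ]
    chosen = residues_within_sphere(toy_atoms, center=(0.0, 0.0, 0.0), radius=12.0)
    print(chosen, len(atoms_in_residues(toy_atoms, chosen)))
```

Note that the far atom of residue 1 is retained because the selection works on whole residues, which is why the saved active site file can extend somewhat beyond the radius you set.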

Printing and Exiting from Q/C:

Graphics generated on the screen can be printed on the Laser Printer in Ardolf and on the color HP Desk Jet. The best method is to use the snapshot program from the SGI described previously in A GUIDE TO PRINTING AND RECORDING FROM THE SGI NETWORK. Remember to select preferences and change the color definition to black and white to reverse the black background of the QC window.

Some files can be saved directly as postscript files from Q/C. Instructions for saving a Ramachandran plot in inverse postscript form were given above. (Inverse postscript is the negative image of the original plot. The original plot has a black background, which takes forever to plot.) To plot on the Laser printer, type the following command in a Unix shell, which hopefully will be visible at the bottom of your screen:

     lp filename.ps

To plot on the color Desk Jet, type the following command:

     lp -das233c filename.ps

To plot an image of a molecule directly from Q/C, select File from the top menu of QC, then select in succession Plot Molecule, Generate Molecular Graphics View Plot, Postscript Format, Translate as Color. You must save the plot in postscript format to print it (at least for now). Print these files as described above. A note of caution: I have not yet been very successful printing some of the more exotic images that can be created. Keep trying this and the other options available.

You must close many of the menus and close any active proteins before you can exit Quanta.

Please be aware that this is just an introduction to the features of QC and Protein Design. You will have to build on these features by looking through the book and discussing the program with others in the class and with me. These programs are quite easy to use, but take a lot of time to figure out. Good luck. More instructions will appear with time.

Detailed Use of Quanta/Charmm 1: Segmenting PDB files; Charmm Energy; Charmm Minimization

These QC instructions will demonstrate how to obtain protein and substrate information from the internet, how to separate the regions of a protein/substrate complex, how to manipulate the graphical displays of the various regions, and how to begin molecular modeling calculations for the protein. All these skills will be required for your final video presentation. The enzyme carbonic anhydrase is used as the example throughout.

Find the structure for carbonic anhydrase complexed with bicarbonate (1hcb: Carbonic Anhydrase I, E.C. 4.2.1.1) at the Protein Data Bank, as described earlier. Be sure to append .pdb to the filename! Enter QC. Under FILE, choose Import. Open the file you saved from the Protein Data Bank. Quanta should import the structure of the protein complex onto the screen. This entire complex should be saved as an .msf file. Do not WRITE OUT BONDTYPES when saving; the crystal structure does not provide any bond type information to Quanta.

Go to Edit on the top menu bar of Q/C and select SPLIT AND CLEAN. New windows will appear on the right hand side of the screen. Select the option SAVE TO SEPARATE MSF's. QC will automatically split the structure into separate files, including:

protein
nucleic acid
other
solvent

The "other" files might include metal ions, inhibitors, etc. Exit Split and Clean. A new molecule management window appears at the bottom right of the desktop, listing the multiple separate files. Click on the YES under Visible for each file to toggle it on and off. Each file can then be rendered separately.
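The sorting that SPLIT AND CLEAN performs can be pictured as a classification of each coordinate record by its residue name. The sketch below, in Python, illustrates the idea on PDB-format text and is not how Quanta implements the command. The residue-name lists are partial, the helper make_pdb_line and the residue numbers in the demo are made up for illustration, and the residue name is read from columns 18 to 20 of each ATOM or HETATM record as the PDB format specifies.

```python
# Partial, illustrative residue-name lists used for the classification.
AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
NUCLEOTIDES = {"A", "C", "G", "T", "U", "DA", "DC", "DG", "DT"}
WATERS = {"HOH", "WAT"}


def classify_residue(residue_name):
    """Return 'protein', 'nucleic acid', 'solvent', or 'other' for a residue name."""
    if residue_name in AMINO_ACIDS:
        return "protein"
    if residue_name in NUCLEOTIDES:
        return "nucleic acid"
    if residue_name in WATERS:
        return "solvent"
    return "other"  # metal ions, substrates, inhibitors, and so on


def split_pdb_lines(lines):
    """Group ATOM/HETATM lines by category; all other record types are ignored."""
    groups = {"protein": [], "nucleic acid": [], "other": [], "solvent": []}
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            residue_name = line[17:20].strip()  # PDB columns 18-20
            groups[classify_residue(residue_name)].append(line)
    return groups


def make_pdb_line(record, serial, atom_name, residue_name, residue_number):
    """Hypothetical helper: build a column-aligned coordinate line for the demo."""
    return f"{record:<6}{serial:>5} {atom_name:<4} {residue_name:>3} A{residue_number:>4}"


if __name__ == "__main__":
    demo = [
        make_pdb_line("ATOM", 1, "CA", "HIS", 94),      # made-up numbering
        make_pdb_line("HETATM", 2, "ZN", "ZN", 261),
        make_pdb_line("HETATM", 3, "C", "BCT", 262),
        make_pdb_line("HETATM", 4, "O", "HOH", 300),
    ]
    for category, members in split_pdb_lines(demo).items():
        print(category, len(members))
```

For the carbonic anhydrase complex, zinc (ZN) and bicarbonate (BCT) both fall into the "other" group under this scheme, so they would still need to be separated from each other by residue name.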

[There is a more difficult way to separate the files as well. The following describes how to do it, using carbonic anhydrase as an example.]

Under DRAW, go to Color>By Segment. Quanta will color the protein complex. You should be able to find four different regions: the enzyme itself, a surrounding cage of water molecules, the bicarbonate substrate, and a bound zinc cofactor. Segmenting the structure into different parts allows you to manipulate and display each of the segments at will, and to render the molecules in any way that suits your needs. You will need to separate these four regions into four different files.

Under EDIT, go to Active Atoms>Selection Tools. The Active Atoms menu will appear at the right side of the screen. In this menu, first choose Pick Segment and then Residue Type. Choose the amino acid residues only! Do not choose BCT (bicarbonate), HOH (water), or ZN (zinc). Click on OK and then on Finish in the Active Atoms menu. The protein should now be the only thing appearing on the display. Under FILE, Save the protein with a new filename (remember not to choose WRITE OUT BONDTYPES).

Now go to EDIT and choose Active Atoms>All Atoms. At this point, all the atoms in the structure (enzyme, water, bicarbonate, and zinc) should reappear. If this happens, repeat the directions in the previous paragraph for BCT alone, then HOH alone, then ZN alone.

If all the atoms do not reappear (and as for why they don't, your guess is as good as mine), a slightly different process may be used to separate the regions. Under FILE, Open the .msf file of the original enzyme/substrate complex. REPLACE the current selection. The entire complex should reappear. Go back to EDIT and Active Atoms>Selection Tools. Again, Pick Segment and Residue Type, and choose BCT only. Click on OK and then on Finish in the Active Atoms menu. The bicarbonate should now be the only thing appearing on the display. Under FILE, Save the bicarbonate with a new filename. Repeat this process for water and zinc.

Once you have completed this process, you can graphically manipulate the segments individually. Make sure all four files for all four segments are opened and active.

Under APPLICATIONS, choose Protein Design. On the pop-up Protein Utilities menu to the right, choose Molecule Colors. In the dialog box which now appears on the screen, you can choose to display the protein by its alpha carbon trace only, or including its side chains, or in several other ways. The alpha carbon trace, for one, gives a good idea of the protein's secondary structure. After each selection, the program should generate a new line diagram of the protein. After you have finished in Protein Design, choose Finish from that menu (if you want all the atoms to reappear at this point you must COLOR ALL ATOMS in the Molecule Colors dialog box).

Now go under DRAW to Solid Models>Selection Tools. (Quanta may crash here; there seems to be some problem in moving from Protein Design to the solid models. Just restart Quanta if this happens, and once Quanta is open again choose Restart Quanta under the FILE menu.) The Solid Display menu and the Solid Schemes and Utilities menus will appear at the right. Decide what kind of model you want to generate and then choose that model under the Solid Schemes menu (for example, choose thin ribbon). Now you need to specify the file to which you want that model applied. If you used the simpler SPLIT AND CLEAN command, pick the separate file that you would like to render. Alternatively, on the Solid Display menu, Pick Segment and then Residue Type (for example, choose all the amino acid residues). Finish. The solid model should appear on the screen. Be patient. You may generate as many solid models as you wish, and you can control whether or not a particular model is displayed by turning it on and off in the Solid Objects Management box which will appear. Remember, you can apply all of these segmentation and coloring techniques to just the active site region, which can be modeled as described earlier.

Using CHARMm

Charmm is the molecular mechanics program which is used to determine the energy of molecules and to minimize the energy of new structures. Make sure you have a console (i.e. dialog box) open at this point. CHARMm uses two kinds of files for mathematically minimizing structures. The first kind, RTF (for Residue Topology File), can only be used for molecular structures that CHARMm recognizes. RTF files are available for amino acids (amino.rtf), water (tip3.rtf), and metal ions (ions.rtf). If the program finds amino acids in the structure, it will obtain parameters for energy determination from the amino.rtf file. This file is the default selection of Charmm, and includes only polar H's in the calculation. If you wish all amino acid hydrogens to be included, use aminoh.rtf. We will not use this file because it adds an enormous number of atoms to the structure. Remember that hydrogen atoms do not appear in the crystal structure because, having only one electron, they scatter x-rays too weakly to be located. QC can add them to the protein, but all those additional H's add complexity to the equations. RTF files are not available for the substrates. CHARMm uses the second kind of file, the PSF, to minimize the energies of molecules it does not recognize. You must generate a PSF file only for the substrate.

First, the three regions of the enzyme complex that have RTFs will be minimized. Open the zinc, water, and protein files, using the append command (not replace). Make sure the bicarbonate file is closed. Under CHARMm (top menu bar), go to CHARMm Mode and then choose RTF OPTIONS. A dialog box will appear. Click on RTF files, and then add amino.rtf (if you have a protein), dna.rtf (if you have DNA), tip3.rtf (if you have water), and ions.rtf (if you have metal ions). Then click OK. Then, on the Molecular Modeling menu (to right), choose CHARMm Energy. The console will give you a breakdown of the individual contributions to the molecule's energy. Now choose CHARMm Minimization. Minimize the energy until it stays at a fairly fixed value (this may require several minimizations). Choose CHARMm Energy again. How does it compare? It is important to minimize the protein, crystallographic water, and the metals together, since each structure is constrained by the others. If the water were minimized independently, for example, it might adopt a different set of structures which are not H-bonded to the enzyme.

Now open only the bicarbonate file. Go under EDIT to Molecular Editor. On the Edit Atoms menu (to right), Add All Hydrogens. Notice anything wrong with the structure? This occurs because hydrogens are too small to be located by X-ray crystallography and are not included in the coordinates from the data bank. Quanta has no template for this molecule the way it does for amino acids, so it does not add the hydrogens correctly. Click on Delete Atoms and delete all the hydrogens except for one attached to an oxygen. Click on Add Bonds and then click on the carbon and one of the two oxygens lacking a hydrogen. A double bond should appear. Now open Editor Options and select AUTOMATICALLY ADJUST BOND LENGTHS, AUTOMATICALLY OPTIMIZE ANGLES, REASSIGN ATOM TYPES, and REASSIGN ATOM CHARGES. Quit and save. When Quanta asks for the desired total charge on the molecule, change 0.000 to -1.000.

Under CHARMm, go to CHARMm Mode and choose PSF. On the Molecular Modeling menu (to right), choose CHARMm Energy. You will probably get a list of missing parameters. Close the list. CHARMm should then do its thing and the console will give you a breakdown of the individual contributions to the molecule's energy.

Now choose CHARMm Minimization. Minimize the energy until it stays at a fairly fixed value (this may require several minimizations). Important: if, during these steps, the console tells you that CHARMm is out of logical swap space, the system may be too overloaded to process the calculation. You may have to try it again when there are fewer users on the system.

Now open all four regions: bicarbonate, water, enzyme, and zinc. Make sure CHARMm is set for PSF. On the Molecular Modeling menu (to right), choose CHARMm Energy, and examine the energy breakdown. Then Minimize energy. Minimize several times until the energy value changes by less than 100.

Under CHARMm, select Minimization Options. Choose ADOPTED-BASIS NEWTON-RAPHSON. Up until now, you have been using steepest descents, which gives a gross minimization (kind of like using the coarse control on an instrument). Adopted-basis Newton-Raphson should now converge mathematically on an overall energy minimum (like using the fine control on an instrument). Nevertheless, you may have to perform 8-10 minimizations before the energy stops changing. In the meantime, amuse yourself by reading the Quanta/CHARMm manuals.

After you are done minimizing the structure, you may go back and alter the substrate by protonating it, adding functional groups, etc., and then attempt to minimize this new structure. Theoretically, you should see some major differences in energy.
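The advice to keep minimizing until the energy stops changing amounts to a simple convergence loop. The following Python sketch shows that loop with steepest descents on a toy problem, a single harmonic bond, and is not CHARMm's algorithm or its adopted-basis Newton-Raphson method. The force constant, equilibrium length, step size, and tolerance are all assumed values chosen only so that the example runs.

```python
def bond_energy(length, k=300.0, b0=1.5):
    """Harmonic bond energy k * (b - b0)**2 with assumed parameters."""
    return k * (length - b0) ** 2


def bond_gradient(length, k=300.0, b0=1.5):
    """Derivative of the harmonic bond energy with respect to the bond length."""
    return 2.0 * k * (length - b0)


def steepest_descent(length, steps, step_size=0.001):
    """Take `steps` fixed-size moves directly down the energy gradient."""
    for _ in range(steps):
        length -= step_size * bond_gradient(length)
    return length


def minimize_until_converged(length, tolerance=1e-4, steps_per_round=50, max_rounds=20):
    """Repeat minimization rounds until the energy changes by less than `tolerance`."""
    energy = bond_energy(length)
    history = [energy]
    for _ in range(max_rounds):
        length = steepest_descent(length, steps_per_round)
        new_energy = bond_energy(length)
        history.append(new_energy)
        if abs(energy - new_energy) < tolerance:
            break
        energy = new_energy
    return length, history


if __name__ == "__main__":
    final_length, energies = minimize_until_converged(2.0)
    print(final_length, len(energies) - 1, "rounds")
```

Each entry in the returned history corresponds to one press of the minimization command followed by a CHARMm Energy check, and the loop stops when two successive values agree to within the chosen tolerance.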

Energy terms:

The Charmm energy function uses molecular mechanics to determine the energy of molecules. It includes contributions from both bonded and nonbonded atoms. In molecular mechanics, the atoms are viewed as masses connected by springs, with force constants proportional to the bond strength. The total energy is broken down into 7 terms. The bonded terms include Ebond, a vibrational (bond stretching) potential; Eangle, an angle bending potential; Edihedral, a potential function arising from rotation around dihedral angles; and Eimproper, for an out-of-plane sp2 wag. The nonbonded terms include Eelect, which reflects the electrostatic potential between charged groups, and Evdw, which is modeled by the 6-12 Lennard-Jones potential (reflecting van der Waals attraction and repulsion).

The crystal structure should be close to the global free energy minimum. When CHARMm Energy is used, the total energy is displayed, as well as the contribution of each of the energy terms listed above. The total energy should be negative. Ebond, Eangle, Edihedral, and Eimproper should be close to 0, since the atoms should be close to their equilibrium, low-energy positions. The largest contribution to the energy is Eelect, followed by Evdw. Eelect is so negative because Charmm assigns a partial charge to each atom and calculates Eelect from these assignments. The actual numbers, as far as I can surmise, have little meaning. Only the changes in the numbers that accompany structural changes are important. In addition, the relative size of the different energy terms is important. If Evdw is a large positive number, for instance, the implication is that there is van der Waals overlap of some atoms, which would obviously destabilize the molecule. This could imply that the original crystal structure needs refinement.
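For reference, the six terms named above are commonly written in the following textbook form. This is a sketch under stated assumptions rather than the exact expression used by this version of the program: the symbols (force constants k, equilibrium values with subscript 0, multiplicity n, phase delta, partial charges q, Coulomb constant C, dielectric constant D, well depth epsilon, and minimum-energy distance R_min) are notation introduced here for illustration.

```latex
\begin{align}
E_{total} &= E_{bond} + E_{angle} + E_{dihedral} + E_{improper} + E_{elect} + E_{vdw} \\
E_{bond} &= \sum_{bonds} k_b \,(b - b_0)^2 \\
E_{angle} &= \sum_{angles} k_\theta \,(\theta - \theta_0)^2 \\
E_{dihedral} &= \sum_{dihedrals} k_\phi \,\bigl[1 + \cos(n\phi - \delta)\bigr] \\
E_{improper} &= \sum_{impropers} k_\omega \,(\omega - \omega_0)^2 \\
E_{elect} &= \sum_{i<j} \frac{C\, q_i q_j}{D\, r_{ij}} \\
E_{vdw} &= \sum_{i<j} \varepsilon_{ij} \left[ \left(\frac{R_{min,ij}}{r_{ij}}\right)^{12} - 2 \left(\frac{R_{min,ij}}{r_{ij}}\right)^{6} \right]
\end{align}
```

Written this way, it is easy to see why the first four terms vanish at the equilibrium geometry, and why Evdw becomes large and positive when two atoms overlap: as the distance between them falls below R_min, the twelfth-power repulsive term dominates.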

PERFORMING A CONFORMATIONAL SEARCH AND ANALYZING THE VARIOUS CONFORMATIONS USING QUANTA/CHARMm

Creating the tripeptide: Once in Quanta, go under APPLICATIONS, go under Builders, and then click on Sequence Builder. Choose the AMINOH.RTF file and open it. Click on the amino acids you want in your sequence. Then go under Sequence Builder and click on Return to Molecular Modeling. When asked if you wish to save your changes, click on YES. Place the arrow in the blue box, type the name of your file, and press enter. At this point your amino acid sequence should appear on the screen.

Now go under APPLICATIONS and click on Conformational Search. In the Conformational Search box, click on Torsions. Now go into the Torsions box, and click on Define Peptide Backbone Torsions. When you have done this, click on Save Torsions. Again, give the file a name (it can be the same name), and save it. When you have finished this, click on Exit Torsions.

At this point, we are ready to begin the search. In the Conformational Search box, click on Setup Search and then on Random Sampling. Define the number of samples you want and the torsion angle window. You may use a variety of numbers, but I would suggest 100 and 360, respectively. When you have completed this, click on the box in front of Calculate CHARMm Energy for Each Structure, and on the box in front of Display Each Structure. These boxes should now have checks in them. If not, go back and click on them again. When this is done, click on OK. Now go into the Conformational Search box and click on Do Search. The computer will then ask you to select or define an output search file. Once again, you must give it a name (again, it can be the same name), and click on NEW. At this point, the computer will begin to go through the sampled conformations.
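Random sampling of torsions can be pictured as follows: each sample assigns every backbone torsion a random value drawn from a window of the chosen width. This Python sketch illustrates the idea and is not Quanta's implementation. In particular, it assumes the window is centered on the starting value of each torsion, and the starting torsions, function names, and seed are made up for illustration.

```python
import random


def wrap_angle(angle):
    """Wrap an angle in degrees into the range -180 to 180."""
    return (angle + 180.0) % 360.0 - 180.0


def random_conformations(start_torsions, n_samples=100, window=360.0, seed=0):
    """Generate `n_samples` torsion sets, each torsion drawn within +/- window/2 of its start."""
    rng = random.Random(seed)  # fixed seed so the sampling is reproducible
    half_window = window / 2.0
    samples = []
    for _ in range(n_samples):
        sample = [wrap_angle(t + rng.uniform(-half_window, half_window))
                  for t in start_torsions]
        samples.append(sample)
    return samples


if __name__ == "__main__":
    # Hypothetical starting phi/psi values for the backbone of a tripeptide.
    start = [-60.0, -45.0, -60.0, -45.0]
    conformations = random_conformations(start, n_samples=100, window=360.0)
    print(len(conformations), conformations[0])
```

With a window of 360 every torsion can take any value, so 100 samples cover only a small fraction of the possible conformations. Narrowing the window concentrates the samples around the starting structure.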

When the computer is done, click on Analysis in the Conformational Search box. You now want to limit the conformations to only those with negative energies. To get rid of all of the positive energy conformations, click on Trace in the Plots box. When the box shows up in the middle of the screen, make sure that Potential Energy is chosen and then click on OK. When the graph appears, go under Trace Tools, and click on Set Scales. Click in the box before the Y-axis scale, click on the Max: box, and type in the number zero. When you have completed this, click on the OK box.

Again, go under Trace Tools, but this time click on Select Y-range. If you look at the bottom of the trace graph, you will see that it tells you to select locations in the graph view. To do this, click on the top of one of the green peaks until a pink crosshair appears. You may have to click on it a number of times. When this is done, click on the bottom of the lowest peak until several yellow dots appear. Again, you may have to click a number of times.

When you have completed this, go into the Analysis box located on the right and click on File Options/Filters. A box will appear in the middle of the screen. Click on the circle in front of Save Selected Conformations and then click on OK. Make sure the .csr file is selected and then press OK. At this point, you will again need to give your file a name. This time, give it a new name and click on NEW. When this is finished, go into the Analysis box and click on Exit Analysis.

Now go back into the Conformational Search box and click on Analysis. Make sure that the search file (.csr) is chosen and then click on OK. Click on the file to which you gave your new name and then click on Open. You have now completed your conformational search and have limited it to only those conformations with negative energies. At this point, you can view your data in a number of different ways. Simply choose from the Plots box how you would like it represented.
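The filtering done here with Set Scales and Select Y-range is equivalent to keeping only the conformations whose energy falls inside a chosen range. The Python sketch below shows that selection on made-up data. The function name, the dictionary layout, and the energy values are assumptions for illustration, and nothing here reads or writes Quanta's search files.

```python
def select_energy_range(conformations, y_min=float("-inf"), y_max=0.0):
    """Keep conformations with y_min <= energy < y_max, sorted from lowest energy up."""
    selected = [c for c in conformations if y_min <= c["energy"] < y_max]
    return sorted(selected, key=lambda c: c["energy"])


if __name__ == "__main__":
    # Made-up search results: an index and a potential energy for each conformation.
    results = [
        {"index": 1, "energy": 12.4},
        {"index": 2, "energy": -3.1},
        {"index": 3, "energy": 250.0},
        {"index": 4, "energy": -8.7},
    ]
    for conformation in select_energy_range(results):
        print(conformation["index"], conformation["energy"])
```

Setting the upper limit to zero reproduces the selection made above, and sorting the survivors puts the most favorable conformation first.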