11. Refinement: Flexible fitting
Although rigid fitting brings the map and the atomic model into approximate agreement, a detailed visual inspection reveals that some residues are not perfectly fitted. To get a better fit, not only of the carbon skeleton but also of the residue side chains, a flexible fitting or refinement has to be carried out. Refinement can thus be defined as the process of optimizing the model parameters to fit the experimental data. Two families of strategies can be followed: refinement in real space and refinement in Fourier (reciprocal) space. Scipion implements two protocols for real space refinement, ccp4-coot refinement (Appendix Coot refinement [Emsley et al., 2010]), which is interactive, and phenix-real space refine (Appendix PHENIX Real space refine [Afonine et al., 2018]), which is automatic, and one automatic protocol to refine the model in reciprocal space, ccp4-refmac (Appendix CCP4 Refmac [Vagin et al., 2004]).
Observe the new steps in the modeling Scipion workflow in Fig. 11.1.
11.1. CCP4 Coot Refinement
Initially devoted to atomic models obtained by X-ray crystallography methods, Coot (from Crystallographic Object-Oriented Toolkit) is a 3D computer graphics tool that displays the map and the fitted model simultaneously and supports mostly interactive modeling operations. This tutorial does not try to show every functionality of Coot, only how to open, close and save partial and final refined structures in Scipion, but some basic relevant Coot commands will be shown. We start by refining our model with Coot. First of all, open the ccp4-coot refinement protocol (Fig. 11.2 (1)), load the map asymmetric units (2), with electron density normalized to 1 (Coot performs this step by default), and the fitted structure model (3). Reading the protocol Help is recommended. After executing the protocol (4), the Coot graphics window will appear and you can start working.
According to Fig. 11.3 (B), the MET residue of the new chain A does not fit the map density. Maybe this residue has been processed post-translationally, as we anticipated in Input data description (Sequences section). To answer this question, go to the Coot main menu and select Draw -> Go to Atom… -> Chain A -> A 1 MET (Fig. 11.4 (A)). The MET residue will be placed in the center of the Coot graphics window. Check whether this residue is surrounded by any electron density. As Fig. 11.4 (B)(1) shows, no density is associated with the first residue of the chain, so MET will be deleted. Go to the lower right side menu and select the symbol to delete items (B)(2). Select Residue/Monomer in the Delete item window that opens, and click the MET residue that you want to delete. Go again to Validate -> Density fit analysis and check whether the orange bar shown for the MET residue in Fig. 11.3 (D) has disappeared.
[myvars]
imol: 0
aa main chain: A
aa auxiliary chain: AA
aaNumber: 4
step: 10
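The fragment above has INI-style syntax. As a minimal sketch, and assuming it is meant to be read as such, it could be parsed with Python's standard configparser module. The keys are used exactly as printed above; the helper name read_myvars is hypothetical and is not part of Scipion or Coot.

```python
import configparser

# The fragment exactly as printed above (keys kept verbatim, spaces included).
FRAGMENT = """\
[myvars]
imol: 0
aa main chain: A
aa auxiliary chain: AA
aaNumber: 4
step: 10
"""


def read_myvars(text):
    """Parse INI-style text and return the [myvars] section as a dict.

    Hypothetical helper for illustration; not a Scipion or Coot function.
    """
    parser = configparser.ConfigParser()
    # Keep key case as written; by default configparser lowercases keys.
    parser.optionxform = str
    parser.read_string(text)
    return dict(parser["myvars"])


myvars = read_myvars(FRAGMENT)
imol = int(myvars["imol"])  # all values are read as strings
step = int(myvars["step"])
print(myvars["aa main chain"], imol, step)
```

Values are returned as strings, so numeric entries such as imol and step are converted explicitly.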
w
If you want to add a special label to identify the atomic structure in the Scipion workflow, you can set that label from the Coot main menu: Calculate -> Scripting -> Python opens the Coot Python Scripting window, where you can write your label name, for example label1_HBA_HUMAN. This label will appear in the Summary window of the Scipion framework (Fig. 11.7 (A)). Assuming that #0 is your model number, write in Command:
scipion_write(0, 'label1_HBA_HUMAN')
e to fully stop the Coot protocol.
Note
About chain IDs: Check the id of each chain. Although you can change this id in ChimeraX, as we have seen in the subsection Structural models of human metHgb subunits from templates (metHgb β subunit), you can also perform this task in Coot, as shown in the next example, in which we change the chain id from A to B. To change the name of the chain, go to the Coot main menu, select the option Edit (Fig. 11.8 (A)(1)) and then Change chain IDs, and replace the current name of the chain, A (Fig. 11.8 (B)(2)), with the new one, B (3).
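For readers who want to check or script the same rename outside Coot, the following is a minimal sketch that works directly on PDB-format text. It is not what Coot does internally; the function name rename_chain and the sample atom lines are made up for illustration. It relies only on the PDB format convention that the chain identifier of coordinate records sits in column 22.

```python
def rename_chain(pdb_lines, old_id, new_id):
    """Return PDB lines with chain `old_id` renamed to `new_id`.

    Only fixed-column coordinate records are touched; in the PDB format
    the chain identifier of these records is column 22 (index 21).
    """
    if len(old_id) != 1 or len(new_id) != 1:
        raise ValueError("PDB chain IDs are single characters")
    records = ("ATOM", "HETATM", "ANISOU", "TER")
    renamed = []
    for line in pdb_lines:
        if line.startswith(records) and len(line) > 21 and line[21] == old_id:
            line = line[:21] + new_id + line[22:]
        renamed.append(line)
    return renamed


# Made-up sample records, built with explicit padding so the chain ID
# is guaranteed to land in column 22.
sample = [
    "ATOM      1  N   VAL".ljust(21) + "A" + "   1",
    "ATOM      2  CA  VAL".ljust(21) + "C" + "   1",
]
for line in rename_chain(sample, "A", "B"):
    print(line)
```

Only the records of chain A change; records of other chains, and any non-coordinate lines, are passed through untouched.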
11.2. PHENIX Real Space Refine
The first tab of results shows the initial model atomic structure (Fig. 11.10 (pink)) as well as the refined one (green), both fitted to the normalized map asymmetric unit saved in Coot.
- What is the *CC(mask)* value?
- Which residue shows the lowest correlation value? Why?
- What is that correlation value?
- Which residue shows the second lowest correlation value? Why?
- What is that correlation value?
- What is the correlation value of the *HEME* group?
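To make the correlation values asked about above more concrete, here is a minimal sketch of a correlation coefficient between two densities restricted to a mask. It assumes a plain Pearson correlation over flattened voxel lists; PHENIX's own computation of CC(mask) may differ in its details, and the function name and all numbers are made up for illustration.

```python
import math


def masked_correlation(map_values, model_values, mask):
    """Pearson correlation of two density lists over voxels where mask is True."""
    pairs = [(a, b) for a, b, keep in zip(map_values, model_values, mask) if keep]
    if len(pairs) < 2:
        raise ValueError("need at least two voxels inside the mask")
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("density is constant inside the mask")
    return cov / math.sqrt(var_a * var_b)


# Toy flattened densities (made-up numbers, not taken from the tutorial data).
experimental = [0.1, 0.9, 0.8, 0.2, 0.0, 0.7]
from_model = [0.0, 1.0, 0.7, 0.1, 0.5, 0.9]
inside_mask = [True, True, True, True, False, True]
cc = masked_correlation(experimental, from_model, inside_mask)
print(round(cc, 3))
```

A value close to 1 means that the model density follows the experimental density inside the mask; a residue with a low value is a candidate for manual inspection.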
Note
An interesting application of the PHENIX real space refine visualization tools is the possibility of loading Coot from the PHENIX viewer to correct the structure of outlier residues and clashes. An iterative use of the PHENIX real space refine and Coot protocols is thus possible.
11.3. PHENIX Search Fit
An extension of PHENIX Real Space Refine is phenix-search fit, a protocol implemented in Scipion to fit a small sequence of residues in a certain density of the map and, afterwards, perform the subsequent refinement in real space (Appendix PHENIX Search fit). Let us illustrate the applicability of this protocol with the workflow described in Fig. 11.11.
This example shows a small fraction of residues from the metHgb α subunit that was not completely modeled, except for the skeleton of α carbons. The sequence of the chain is perfectly known, but for certain residues we were unable to trace the side chains, and only ALA residues appear in our atomic structure. A detail of the small fragment of ALA residues can be observed in Fig. 11.12 (red arrows). The protocol phenix-search fit might help us to replace the ALA residues with the appropriate amino acids.
As Fig. 11.11 indicates, the protocol phenix-search fit (4) requires three different inputs (1, 2 and 3):
- Initial map that contains the density of the metHgb α subunit. In this case we use the asymmetric unit map extracted previously (subsection Extraction of the asymmetric unit map, Fig. 7.12).
- Small fragment of atomic structure that contains the ALA small chain. To create this fragment we start from the published atomic structure of the human metHgb α subunit, included in the model with PDB ID 5NI1, which can be downloaded from the database using the protocol import atomic structure. Next, we use the protocol chimerax-operate to isolate chain A of the structure. The atomic structure 5NI1 is the only input of the protocol chimerax-operate. After ChimeraX opens, write in the command line:
sel #2 & ~ #2/A
del sel
scipionwrite #2 prefix 5ni1_chainA_
After saving chain A of the atomic structure 5NI1, run the protocol phenix-dock in map (Fig. 10.2) to fit chain A from the atomic structure 5NI1 in the metHgb asymmetric unit map density. Next, open again the protocol chimerax-rigid fit (Fig. 10.4) and, following the previous instructions and the next ChimeraX command lines, finish the fitting, mutate the sequence between residues 94 and 118 to generate the ALA chain, and finally save the small mutated fragment:
fitmap #3 inMap #2
scipionwrite #3 prefix 5ni1_chainA_fitted_
select #3 & ~ #3/A:94-118
del sel
swapaa #3/A:94-118 ALA
scipionwrite #3 prefix 5ni1_chainA_94_118_MutALA_
- Sequence of the metHgb α subunit imported previously in subsection Sequences (Fig. 6.4).
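If the same fragment preparation has to be repeated for another residue range, the ChimeraX command lines quoted in the second input above can be generated rather than retyped. The following is a minimal sketch; the helper name fragment_commands is hypothetical, and it only assembles text using the commands already shown (select, del, swapaa, scipionwrite), without running ChimeraX.

```python
def fragment_commands(model, chain, first, last, prefix):
    """Build the ChimeraX commands that keep, mutate and save one fragment.

    Mirrors the command sequence quoted in the text; `model` is the ChimeraX
    model number (for example 3) and `first`/`last` delimit the residue range.
    """
    if first > last:
        raise ValueError("first residue must not exceed last residue")
    spec = f"#{model}/{chain}:{first}-{last}"
    return [
        f"select #{model} & ~ {spec}",  # select everything outside the fragment
        "del sel",  # delete the selection
        f"swapaa {spec} ALA",  # mutate the fragment residues to ALA
        f"scipionwrite #{model} prefix {prefix}",  # save the result in Scipion
    ]


for command in fragment_commands(3, "A", 94, 118, "5ni1_chainA_94_118_MutALA_"):
    print(command)
```

The printed lines can be pasted one by one into the ChimeraX command line once the fitting step has finished.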
With these three inputs we can complete the phenix-search fit protocol form (Fig. 11.13). Open it in the Scipion left menu (1) and include the asymmetric unit map (2), detailing its resolution (3), as well as the small fragment of mutated structure previously saved (4) and the sequence downloaded (5), and take advantage of the wizard on the right (6) to select the initial and final residues that delimit the sequence to search.
After executing the phenix-search fit protocol (Fig. 11.13 (7)) we can have a look at the results. Pressing Analyze Results (Fig. 11.13 (8)) opens a window with the Viewer menu (Fig. 11.14 (A)). This menu allows us to visualize a certain number of atomic structures, according to their ranking scores, with side chains fitted in the map density (1). Those structures will be opened in ChimeraX (2), surrounded by the density located within 3.0 Å of the structure (3). The number 1000 shown by default in (1) allows displaying all atomic structures. Pressing Summary Plot (4) opens a pop-up window that shows the score value of each structure, as well as the average and standard deviation of those values (Fig. 11.14 (B)). If we select the visualization of a certain number of atomic structures, 5 for example, as the red arrow indicates in Fig. 11.14 (C), the five best score values will appear highlighted in red in the Summary Plot.
open 5ni1
select #9 & ~ #9/A:94-118
del sel
mmaker #9 to #4
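The quantities reported by the Summary Plot described above (score of each structure, average, standard deviation, and the best N scores) can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the score values are made up, a higher score is assumed to be better, and the sample standard deviation is used, which may not match the exact definition used by the viewer.

```python
import statistics


def summarize_scores(scores, top_n):
    """Return mean, sample standard deviation and the top_n highest scores."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    best = sorted(scores, reverse=True)[:top_n]
    return statistics.mean(scores), statistics.stdev(scores), best


# Made-up score values for illustration only.
scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.49, 0.74]
mean, spread, best_five = summarize_scores(scores, 5)
print(f"mean={mean:.3f} stdev={spread:.3f} best={best_five}")
```

The list best_five plays the role of the values highlighted in red when 5 structures are selected in the viewer.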
11.4. CCP4 Refmac
As in the case of Coot, Refmac (from maximum-likelihood Refinement of Macromolecules) was initially developed to optimize models obtained by X-ray crystallography methods but, unlike Coot, it works automatically and in reciprocal space. The models refined in real space with Coot and PHENIX real space refine, successively, will be used as inputs to perform a second refinement step in Fourier space with the Refmac protocol ccp4-refmac. First, open the Refmac protocol form (Fig. 11.15 (1)), load the volume generated by Coot (2), the atomic structure obtained with Coot (case 3 of Fig. 11.1) (3) or with PHENIX real space refine after Coot (case 4 of Fig. 11.1), and the volume resolution as maximum resolution (4). Execute the protocol (5) and, when it finishes, analyze the results (6).
Clicking the first item in the display menu of results (Fig. 11.16 (1)) opens the ChimeraX graphics window showing the input volume, the initial model (new_label_HBA_HUMAN) obtained with Coot (Fig. 11.17, pink), and the final Refmac refined model (Fig. 11.17, green). Clicking the third item in the display menu of results (Fig. 11.16 (2)) shows a summary of results. Check if the values of R factor and Rms BondLength have improved with this refinement process in these three cases:
- Running Refmac after phenix-real space refine without a mask: Compare previous Refmac results (after Coot and phenix-real space refine) with those obtained selecting the option No in the protocol form parameter Generate masked volume. Use two different volumes, the one generated by the Coot protocol and the one generated by the extract asymmetric unit protocol. Are there any differences? Why? (Answers in appendix Solutions; Question7)
Have a look at the rest of the items in the display window of results.
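As a reminder of what the two numbers in the summary measure, the sketch below computes a conventional crystallographic R factor, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, and a root-mean-square deviation of bond lengths from ideal values. The amplitudes are made up, the function names are hypothetical, and Refmac's own scaling and weighting may differ from these textbook definitions.

```python
import math


def r_factor(f_obs, f_calc):
    """Conventional R factor: sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)


def rms_deviation(observed, ideal):
    """Root-mean-square deviation between observed and ideal values."""
    if len(observed) != len(ideal) or not observed:
        raise ValueError("value lists must be non-empty and of equal length")
    squared = sum((o - i) ** 2 for o, i in zip(observed, ideal))
    return math.sqrt(squared / len(observed))


# Made-up structure-factor amplitudes before and after a refinement run.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc_before = [90.0, 90.0, 50.0, 50.0]
f_calc_after = [98.0, 82.0, 58.0, 41.0]
r_before = r_factor(f_obs, f_calc_before)
r_after = r_factor(f_obs, f_calc_after)
print(f"R before={r_before:.3f} after={r_after:.3f} improved={r_after < r_before}")
```

For both quantities a lower value after refinement indicates better agreement, with the experimental data in the case of the R factor and with ideal geometry in the case of the bond-length deviation.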
11.5. The best refinement workflow
At this point we wonder about the optimal steps to follow in the refinement process. Should we use Coot first, then PHENIX, then Refmac? Or maybe, with a different map and model, should we start with the automatic refinement and then go to the interactive one? The right answer is that there is no unique answer. The strategies and the number of refinement steps might differ, and the only requirement is that each refinement step should generate a better structure than the previous one. This premise requires applying common validation criteria to assess the progressive improvement of our model.
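The requirement that each step improve on the previous one can be checked mechanically once a set of validation criteria has been chosen. The following is a minimal sketch; the two criteria (cc_mask and clashscore), their values, and the function names are hypothetical choices for illustration, not results from this tutorial.

```python
# Hypothetical validation criteria: True means a higher value is better.
HIGHER_IS_BETTER = {"cc_mask": True, "clashscore": False}


def improved(previous, current):
    """True if no metric got worse and at least one got strictly better."""
    strictly_better = False
    for name, higher in HIGHER_IS_BETTER.items():
        delta = current[name] - previous[name]
        if not higher:
            delta = -delta  # for lower-is-better metrics a decrease is a gain
        if delta < 0:
            return False
        if delta > 0:
            strictly_better = True
    return strictly_better


def first_non_improving_step(history):
    """Return the name of the first step that fails to improve, or None."""
    for (_, prev), (name, cur) in zip(history, history[1:]):
        if not improved(prev, cur):
            return name
    return None


# Made-up metric values for three successive refinement steps.
history = [
    ("coot", {"cc_mask": 0.78, "clashscore": 9.0}),
    ("phenix", {"cc_mask": 0.82, "clashscore": 6.5}),
    ("refmac", {"cc_mask": 0.81, "clashscore": 6.0}),
]
print(first_non_improving_step(history))
```

In this made-up history the last step lowers cc_mask, so it is reported as the first step that does not improve on its predecessor. Which criteria to include, and whether a trade-off between them is acceptable, remains a judgment call.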