12. Structure validation and comparison
Note
Structure validation is a model building step that you have to perform iteratively during the refinement process to assess whether or not you are improving your structure. Once you finish the refinement process you’ll obtain the final assessment values. These values should fall within a certain range if you want to submit the atomic structure to databases. The final validation scores should be computed against the density map that you submit as the main map, even if you used sharpened maps for refinement/validation during the iterative process.
12.1. EMRinger
Specifically designed for cryo-EM data, the EMRinger tool assesses how well a model fits a map by validating high-resolution features such as side chain arrangements. The placement of a side chain relative to the backbone depends on the \chi_{1} dihedral angle (a dihedral angle is the angle between two intersecting planes), which is determined by the atomic positions of (N, C\alpha, C\beta) and (C\alpha, C\beta, C\gamma) (see Fig. 12.2). The side chain dihedral angles tend to cluster near 180^\circ and \pm60^\circ. The smaller the deviations from these values, the better the model and the higher the EMRinger score.
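To make the definition of \chi_{1} concrete, the following minimal Python sketch computes a signed dihedral angle from four atomic positions (N, C\alpha, C\beta, C\gamma) and its angular distance to the nearest of the three staggered values (60^\circ, 180^\circ, 300^\circ, the last one being equivalent to -60^\circ). The function names `dihedral` and `deviation_from_rotamer` and the coordinates in the usage note are hypothetical and chosen only for illustration. This is not the EMRinger algorithm, which works on the map density; it only illustrates the geometry described above.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    For chi1 the points are N, CA, CB and CG (illustrative helper,
    not part of EMRinger or Scipion).
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane (p0, p1, p2)
    n2 = _cross(b1, b2)  # normal of the plane (p1, p2, p3)
    b1_len = math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b1) / b1_len
    return math.degrees(math.atan2(y, x))


def deviation_from_rotamer(chi1_deg, rotamers=(60.0, 180.0, 300.0)):
    """Smallest angular distance (degrees) from chi1 to a staggered value."""
    def angular_distance(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)
    return min(angular_distance(chi1_deg, r) for r in rotamers)
```

For example, with the made-up positions `dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))` the two planes are perpendicular, and the resulting angle lies 30^\circ away from the nearest staggered value.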
We can start by assessing with EMRinger the metHgb \alpha subunit models that we have generated along the modeling workflow. In each case, open the protocol phenix-emringer (Fig. 12.3 (1)), load the extracted map asymmetric unit (initial or saved with Coot) (2) and the atomic structure that you’d like to validate against the map (3), execute the program (4) and analyze the results (5). A menu to check the results in detail will open (bar EMRinger results). It shows the Phenix EMRinger plots for the density thresholds and for the rolling window of each chain, as well as the dihedral angles for each residue. The most relevant results, especially the EMRinger score, are also written in the protocol SUMMARY (6).
12.2. MolProbity
Run the MolProbity protocol to obtain its statistics for the models produced by ChimeraX rigid fit, by Coot refinement, by PHENIX real space refine (form parameters indicated in Fig. 11.9) run after Coot, and by Refmac refinement with MASK run before and after PHENIX real space refine.
12.3. Validation CryoEM
12.4. Model Comparison
The question posed in the previous item does not have an easy answer in the real world, in which we do not know the final atomic structure. In this tutorial, nevertheless, we know the atomic structure already published for this cryo-EM map, and we may wonder how far we are from it. The question can be answered by comparing a) the validation statistics that we have obtained for our models with the statistics computed for the \alpha subunit available in PDB structure 5NI1, and b) the atomic structures themselves by superposing them.
12.4.1. Comparison of validation statistics
Validation statistics of the metHgb \alpha subunit of PDB structure 5NI1 should be obtained first, so that they can be compared with the validation statistics of our models. To this end we are going to follow the workflow highlighted in Fig. 12.6:
- Protocol import atomic structure: Download structure 5NI1 from PDB.
- Protocol chimerax-operate (Appendix CHIMERAX operate): Similar to ChimeraX rigid fit, the ChimeraX operate protocol allows you to perform operations with atomic structures. We are going to use this protocol to save the metHgb \alpha subunit independently in Scipion. Open the protocol (Fig. 12.7 (1)), complete the parameter PDBx/mmCIF with the previously imported atomic structure 5NI1 (2), and execute the protocol (3). The ChimeraX graphics window will open with the structure 5NI1 as model number #2. To save the structure of the human metHgb \alpha subunit (chain A) independently, write in the ChimeraX command line:
select #2/A
save /tmp/5ni1_chainA.cif format mmcif models #2 selectedOnly true
open /tmp/5ni1_chainA.cif
scipionwrite #3 prefix 5ni1_chainA_
Note that the model saved from the ChimeraX command line includes both the amino acid chain and the HEME group. If you are interested in extracting only the amino acid chain, you can use the protocol atomstructutils-operator, specifically designed to extract/add individual chains from/to an atomic structure (Appendix Atomic Structure Chain Operator). Compare the results of protocols ChimeraX operate and Atomic Structure Chain Operator in Fig. 12.8. The red arrow points at the HEME group.
- Protocol phenix-dock in map: Open the PHENIX dock in map protocol and follow the instructions indicated above. This time the structure saved in ChimeraX operate replaces our previous model. Results can be observed in Fig. 12.9.
- Protocol chimerax-rigid fit: Open the ChimeraX rigid fit protocol again and, following the instructions already indicated, include this time the atomic structure placed_model.cif generated in the previous step. To fit the metHgb \alpha subunit from the 5NI1 structure in the extracted asymmetric unit and save the fitting, write in the ChimeraX command line:
fitmap #3 inMap #2
scipionwrite #3 prefix 5ni1_chainA_fitted_
- Validation protocols phenix-emringer and phenix-validation_cryoem: Compute validation statistics with these two protocols for metHgb \alpha subunit from PDB structure 5NI1, write respective values in the previous table (Table 2), and compare them with the statistics of our models.
Considering the results shown in appendix Solutions (Question9) for the metHgb \alpha subunit, we can conclude that published structures are not perfect and that we are not very far from this published one. In fact, our models surpass it in every statistic except CC(mask). Nevertheless, the different models generated after Coot refinement can still be improved by iterative refinement processes. Validation statistics thus allow us to follow the quality improvement of atomic models.
12.4.2. Comparison of atomic structures
The PHENIX protocol phenix-superpose pdbs compares two atomic structures by superposing them. The root mean square deviation (RMSD) between the fixed structure (the published one) and each of our models allows us to rank the models according to their similarity to the published model. Open the PHENIX superpose pdbs protocol form (Fig. 12.10 (1)), include the published structure of the metHgb \alpha subunit as fixed structure (2) and each one of the models generated along the workflow (3), execute the protocol (4) and check the results by pressing Analyze results (5). The arrows in Fig. 12.11 mark the parts that differ between the atomic structure of the metHgb \alpha subunit from PDB structure 5NI1 (green) and our model generated by automatic refinement with the PHENIX real space refine protocol (pink). By opening these structures you can see the differences between them. Finally, complete Table 2 with the value of RMSD (final) (6) obtained for each model (answers in appendix Solutions; Question9).
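As a reminder of what the RMSD value measures, the following minimal Python sketch computes the root mean square deviation between two lists of paired atomic coordinates. It assumes that the two structures are already superposed and that the atoms are listed in the same order; it does not perform the fitting step that the superposition protocol carries out before reporting its final value. The function name `rmsd` and the coordinates in the usage note are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates.

    Illustrative helper: atoms are assumed to be paired by index and
    the structures already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("Both coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared distance between the two paired atoms
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))
```

For instance, `rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)])` averages the squared displacements 0 and 4 over two atoms, giving the square root of 2. A lower value indicates that the model is closer to the fixed structure.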