1. ModelAngelo Initial Model Prediction protocol

The protocol modelangelo - model builder, from the scipion-em-modelangelo plugin, produces ModelAngelo de novo atomic structure predictions using a 3D density map as restraint [Jamali et al., 2023]. This method is useful for tracing the structure of map areas with fairly good resolution (commonly better than 4 angstroms, that is, numerically lower), and for assigning residues to the density, which can provide initial clues to locate a protein that can then be traced manually.

  • Requirements to run this protocol and visualize results:
    • Scipion plugin: scipion-em
    • Scipion plugin: scipion-em-modelangelo
    • Scipion plugin: scipion-em-chimera
  • Scipion menu: Model building -> Initial model (Fig. 1.5 (A))

    Fig. 1.5 Protocol scipion-em-modelangelo. A: Protocol location in Scipion menu. B: Protocol form to generate ModelAngelo structure predictions.

  • Protocol form parameters (Fig. 1.5 (B)):
    Input section:
    • Refined volume: Mandatory param to load the electron density map previously downloaded or generated by refinement in Scipion.
    • Protein sequences: Optional param to load one or several protein sequences experimentally identified, for example from a mass spectrometry assay. When sequence information is added, ModelAngelo will try to fit it in the map. If no sequence at all is provided, ModelAngelo will try to guess a fitting sequence according to the map density for each residue.
    • Volume mask: Optional param to load a mask that restricts the initial model prediction to the area covered by the non-zero values of the mask.
    • Configuration File: Advanced param that allows modifying the params included in the configuration file.
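To make the role of each form parameter more concrete, the following is a minimal sketch of how the inputs above might map onto a ModelAngelo command line. It is not the plugin's actual implementation: the helper `build_modelangelo_command` is hypothetical, and the subcommands and flags (`build`, `build_no_seq`, `--volume-path`, `--protein-fasta`, `--mask-path`, `--config-path`, `--output-dir`) are written as recalled from the ModelAngelo command-line help and should be checked with `model_angelo build --help` for the installed version.

```python
from typing import List, Optional


def build_modelangelo_command(
    volume: str,
    output_dir: str,
    protein_fasta: Optional[str] = None,
    mask: Optional[str] = None,
    config: Optional[str] = None,
) -> List[str]:
    """Assemble a ModelAngelo command from form-like inputs (hypothetical helper)."""
    # With sequences ModelAngelo fits them to the map; without them it
    # guesses residues from the density (assumed subcommand names).
    subcommand = "build" if protein_fasta else "build_no_seq"
    cmd = ["model_angelo", subcommand,
           "--volume-path", volume,
           "--output-dir", output_dir]
    if protein_fasta:
        cmd += ["--protein-fasta", protein_fasta]
    if mask:
        # Optional mask restricting the prediction to its non-zero region.
        cmd += ["--mask-path", mask]
    if config:
        # Optional (advanced) configuration file override.
        cmd += ["--config-path", config]
    return cmd


if __name__ == "__main__":
    # File names are placeholders for illustration only.
    print(" ".join(build_modelangelo_command("map.mrc", "out", "seqs.fasta")))
```

The sketch only builds the argument list, so it can be inspected without ModelAngelo being installed; passing the list to `subprocess.run` would launch the program.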
  • Protocol execution:
    Adding a specific protocol label in the Run name section, at the top of the form, is recommended. To add the label, open the protocol form, press the pencil symbol at the right side of the Run name box, enter the label in the newly opened window, press OK and, finally, close the protocol. This label will be shown in the output summary content (see below). If you want to run this protocol again, do not forget to set the Run mode to Restart.
    To accelerate the execution, this protocol runs on a GPU with at least 8 GB of memory (see documentation). If you do not have access to GPU devices, the protocol can also run without them, although it takes much longer. In either case, do not forget to select Yes or No in FGPU IDs in the form Run window, and to write the number of the GPU to use (0 by default).
    Press the red Execute button at the bottom of the form.
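Before choosing a GPU number in the form, it can help to check which devices meet the 8 GB recommendation mentioned above. The sketch below is one possible way to do that, assuming an NVIDIA GPU with `nvidia-smi` available; the function names and the exact threshold in MiB are illustrative choices, not part of Scipion or ModelAngelo.

```python
import subprocess
from typing import List, Tuple


def parse_gpu_memory(csv_text: str) -> List[Tuple[int, int]]:
    """Parse 'index, memory.total' rows (MiB, no units) into (gpu_id, MiB) pairs."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, mem = (field.strip() for field in line.split(","))
        gpus.append((int(index), int(mem)))
    return gpus


def suitable_gpus(csv_text: str, min_mib: int = 8 * 1024) -> List[int]:
    """Return IDs of GPUs whose total memory is at least min_mib (assumed threshold)."""
    return [gpu_id for gpu_id, mem in parse_gpu_memory(csv_text) if mem >= min_mib]


def query_nvidia_smi() -> str:
    """Ask nvidia-smi for GPU indices and total memory as header-less CSV."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        print("GPUs with enough memory:", suitable_gpus(query_nvidia_smi()))
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("No usable NVIDIA GPU found; the protocol would run on CPU.")
```

Some cards marketed as 8 GB report slightly less than 8192 MiB, so `min_mib` may need to be lowered a little in practice.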
  • Visualization of protocol results:
    After executing the protocol, results can be visualized in the Chimera GUI by clicking Analyze Results. Besides the coordinate axes, two different types of results will be shown depending on the inputs. If sequence information has been included as input, the atomic structure prediction is displayed in two forms, raw and pruned. Raw (unpruned) structures, obtained directly as protocol output, include every structural element with minimal post-processing. Most of the model has thus been built, but lower resolution areas of the model are often wrong. Manual inspection of these areas is essential to remove incorrectly assigned residues. In pruned predictions, instead, this cleanup has already been performed automatically: every portion of the model that does not match the sequence well, based on a hidden Markov model alignment, is removed. In this case, a smaller part of the map is interpreted, but most residues are correctly assigned to the density. See Fig. 6.20 for an example of this protocol output.
    When no sequence information has been included as protocol input, only a raw structure is generated as the ModelAngelo prediction (example in Fig. 6.22). Without sequence information, ModelAngelo itself assigns the residues according to the density, trying to cover the whole map. The user should then inspect the structure and remove the incorrectly assigned residues.
    Prediction outputs are colored according to the values of the B-factor column in the structure file, using the AlphaFold color criteria, and a color key appears in the GUI main window. Note that good matches to the sequence, and not those B-factor values, are the criteria for pruning residues.
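As an illustration of the AlphaFold-style coloring described above, the sketch below maps B-factor values to the four confidence bands used by the AlphaFold Database. It assumes that the B-factor column holds a confidence-like score on a 0-100 scale, as pLDDT does in AlphaFold models; the function names, the treatment of boundary values, and the hex colors (taken from the AlphaFold Database legend) are assumptions, and the palette actually applied by the viewer may differ.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (lower bound, band name, hex color) following the AlphaFold Database legend.
ALPHAFOLD_BANDS = [
    (90.0, "very high", "#0053D6"),
    (70.0, "confident", "#65CBF3"),
    (50.0, "low", "#FFDB13"),
]
VERY_LOW = ("very low", "#FF7D45")


def alphafold_band(score: float) -> Tuple[str, str]:
    """Return (band name, hex color) for a 0-100 confidence-like score."""
    for lower_bound, name, color in ALPHAFOLD_BANDS:
        # Boundary values are assigned to the higher band (an arbitrary choice).
        if score >= lower_bound:
            return name, color
    return VERY_LOW


def band_counts(b_factors: Iterable[float]) -> Dict[str, int]:
    """Count how many values fall into each band."""
    return dict(Counter(alphafold_band(b)[0] for b in b_factors))


if __name__ == "__main__":
    # Made-up values for illustration, not real protocol output.
    example = [96.2, 88.0, 71.5, 64.3, 42.0]
    for value in example:
        print(value, *alphafold_band(value))
    print(band_counts(example))
```

A summary like `band_counts` can hint at which regions deserve manual inspection, but it does not reproduce the pruning step, which relies on sequence matches rather than on these values.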
  • Summary content:
    • Protocol output (below framework):
      • modelangelo - model builder protein name -> raw AtomStruct(pseudoatoms=False, volume=False)
      • modelangelo - model builder protein name -> pruned AtomStruct(pseudoatoms=False, volume=False)
    • SUMMARY box:
      No summary information.