.. figure:: /docs/images/scipion_logo.gif
   :width: 250
   :alt: scipion logo

.. _parallelization:

===============
Parallelization
===============

By default, each function step is executed only after the previous one has
fully completed. To run steps in parallel, the following requirements must
be met:

* In the protocol's ``__init__`` method, set the steps execution mode:

  .. code-block:: python

      # STEPS_PARALLEL is defined in pyworkflow's protocol constants
      from pyworkflow.protocol.constants import STEPS_PARALLEL

      def __init__(self, **args):
          super().__init__(**args)
          # ...
          self.stepsExecutionMode = STEPS_PARALLEL
          # ...

  With this attribute set, all step functions run at the same time by
  default, so the dependencies between them must be defined explicitly to
  ensure that only the steps that can safely run in parallel do so.

* In the protocol's ``_defineParams`` method, add the parallelization section
  defining the number of threads to use:

  .. code-block:: python

      def _defineParams(self, form):
          # Substitute N with the number of threads you want
          # your protocol's form to use by default.
          # It has to be an integer greater than 0.
          form.addParallelSection(threads=N)
          # ...

* In the protocol's ``_insertAllSteps`` function, the steps to be executed by
  the protocol need to be inserted together with their dependencies. An
  example is provided below:

  .. code-block:: python

      def _insertAllSteps(self):
          """
          In this function the steps that are going to be executed should be defined.
          Two of the most used functions are: _insertFunctionStep or _insertRunJobStep
          """
          # List of step ids that createOutputStep has to wait for
          deps = []
          # myList stands for whatever collection of inputs the protocol processes
          for element in myList:
              # Calling processStep in parallel with each input data
              deps.append(self._insertFunctionStep(self.processStep, element, prerequisites=[]))

          # Insert output generation step
          self._insertFunctionStep(self.createOutputStep, prerequisites=deps)

      def processStep(self, element):
          # Do something with that element
          pass

      def createOutputStep(self):
          # Generate outputs
          pass

In this example, we have two functions:

- ``processStep``
- ``createOutputStep``

``processStep`` has to be executed once for each element in the list and, in
this case, those executions happen in parallel.

Calling ``_insertFunctionStep`` returns an id that Scipion assigns to the
step being inserted. The call must also receive a parameter named
``prerequisites``: a list containing the ids of all the steps that need to
finish before the step being inserted can start. If the step has no
dependencies, the list must be empty, but it still has to be supplied, or
errors will occur (this will be fixed soon, but, in the meantime, keep in
mind that at least an empty list has to be passed).

Going back to the example above, ``processStep`` has no dependencies, so the
``prerequisites`` parameter is an empty list for each of its instances.
Additionally, this function takes one positional parameter (``element``),
which needs to be passed before the keyword argument ``prerequisites``.

Every id generated by inserting an instance of ``processStep`` has to be
stored in a list, in this case ``deps``. This list is then passed as the
``prerequisites`` parameter of ``createOutputStep``, which means that
``createOutputStep`` will only start once every instance of ``processStep``
has finished. If an empty list were passed instead of those ids,
``createOutputStep`` would start at the same time as the rest of the steps,
resulting in errors if it needs data produced by any of them.

**Note:** Every step function needs to be inserted within
``_insertAllSteps``. This is because the protocol's GUI, while running, shows
a progress status in the format StepsCompleted/TotalSteps, and TotalSteps
only takes into account the steps inserted within the ``_insertAllSteps``
function itself. If ``_insertFunctionStep`` is called from another function,
even if that function is called inside ``_insertAllSteps``, the protocol GUI
will end up showing more completed steps than total steps (e.g. 100/80). This
does not affect the protocol's results at all, but it is not ideal for a user
to look at either.
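
The effect of ``prerequisites`` can also be tried outside Scipion. The following is a minimal, self-contained sketch that only imitates the behaviour described above using Python's standard ``concurrent.futures`` module. It is not Scipion's implementation: the ``ToyStepRunner`` class and its ``insertFunctionStep`` and ``run`` methods are hypothetical names made up for this illustration.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


    class ToyStepRunner:
        """Hypothetical stand-in for a protocol's step list (not Scipion code)."""

        def __init__(self, threads):
            self.threads = threads
            self.steps = {}  # step id -> (function, args, prerequisite ids)

        def insertFunctionStep(self, func, *args, prerequisites):
            # Register the step and return its id, starting at 1
            stepId = len(self.steps) + 1
            self.steps[stepId] = (func, args, list(prerequisites))
            return stepId

        def run(self):
            done = set()                # ids of finished steps
            pending = dict(self.steps)  # steps not yet launched
            running = {}                # future -> step id
            with ThreadPoolExecutor(max_workers=self.threads) as pool:
                while pending or running:
                    # Launch every pending step whose prerequisites have all finished
                    ready = [sid for sid, (_, _, pre) in pending.items()
                             if set(pre) <= done]
                    for sid in ready:
                        func, args, _ = pending.pop(sid)
                        running[pool.submit(func, *args)] = sid
                    if not running:
                        raise RuntimeError("Some prerequisites can never be met")
                    # Block until at least one running step finishes
                    finished, _ = wait(list(running), return_when=FIRST_COMPLETED)
                    for future in finished:
                        future.result()  # re-raise any error from the step
                        done.add(running.pop(future))


    results = []
    log = []

    def processStep(element):
        results.append(element * 2)

    def createOutputStep():
        # Runs only after every processStep has finished
        log.append(sorted(results))

    runner = ToyStepRunner(threads=3)
    deps = [runner.insertFunctionStep(processStep, element, prerequisites=[])
            for element in [1, 2, 3]]
    runner.insertFunctionStep(createOutputStep, prerequisites=deps)
    runner.run()

In this sketch, passing ``prerequisites=[]`` to the output step instead of ``deps`` would make it eligible to start together with the processing steps, which is the failure mode described above.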