pyworkflow.protocol.protocol module

This module contains the classes required for workflow execution and tracking, such as Step and Protocol.

class pyworkflow.protocol.protocol.FunctionStep(func=None, funcName=None, *funcArgs, **kwargs)[source]

Bases: Step

This is a Step wrapper around a normal function. This class eases the insertion of Protocol function steps through the function _insertFunctionStep.

Params:

  • func – the function that will be executed.

  • funcName – the name assigned to that function (will be stored).

  • *funcArgs – argument list passed to the function (serialized and stored).

  • **kwargs – extra parameters.
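As a rough illustration of the wrapper idea, the sketch below defines a hypothetical MiniFunctionStep (a stand-in, not the pyworkflow class) that stores a callable, a name and the positional arguments, and calls the function when run. The class, the scaleStep function and the attribute names are assumptions made for this example only.

```python
class MiniFunctionStep:
    """Hypothetical stand-in, not pyworkflow's FunctionStep."""

    def __init__(self, func=None, funcName=None, *funcArgs, **kwargs):
        self.func = func                # callable executed by run()
        self.funcName = funcName        # name kept alongside the step
        self.funcArgs = list(funcArgs)  # positional arguments, kept for storage
        self.extra = dict(kwargs)       # extra parameters, unused in this sketch

    def run(self):
        # Call the wrapped function with the stored arguments.
        return self.func(*self.funcArgs)


def scaleStep(value, factor):
    # Example function to be wrapped as a step.
    return value * factor


step = MiniFunctionStep(scaleStep, "scaleStep", 3, 2)
result = step.run()
```

Storing the name and arguments separately from the callable is what allows a step to be recorded and compared between runs, which is the role the parameter descriptions above suggest.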

class pyworkflow.protocol.protocol.LegacyProtocol(**kwargs)[source]

Bases: Protocol

Special subclass of Protocol used when a protocol class is not found, which means that it has been removed or that it lives in another development branch. In that case, LegacyProtocol is used simply to store the parameters and inputs/outputs.

classmethod getClassDomain()[source]

Return the Domain class where this Protocol class is defined.

class pyworkflow.protocol.protocol.ProtImportBase(**kwargs)[source]

Bases: Protocol

Base Import protocol

class pyworkflow.protocol.protocol.ProtStreamingBase(**kwargs)[source]

Bases: Protocol

Base protocol for implementing streaming protocols. Subclasses should implement stepsGeneratorStep (see its description) and create the output at the end of the processing steps that stepsGeneratorStep inserts. To avoid concurrency errors, create the output inside a with self._lock: block. The protocol needs at least 3 threads and should run in parallel mode.

stepsGeneratorStep()[source]

This step should be implemented by any streaming protocol. It should check its input and, when the ready conditions are met, call the self._insertFunctionStep method.

Returns

None
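The streaming pattern described for ProtStreamingBase can be sketched with plain threads. The class below is a hypothetical stand-in, not pyworkflow code: its _insertFunctionStep only records the call, and the step names and data are invented for the example. It shows a generator step that inserts one processing step per ready input item, and a processing step that writes the output inside a lock.

```python
import threading


class MiniStreamingProtocol:
    """Hypothetical stand-in for a streaming protocol; not pyworkflow code."""

    def __init__(self, incoming):
        self._incoming = list(incoming)  # items that are "ready" as input
        self._lock = threading.Lock()    # guards output creation
        self.steps = []                  # recorded (function, args) pairs
        self.output = []

    def _insertFunctionStep(self, func, *args):
        # Stand-in for step insertion: just record the call.
        self.steps.append((func, args))

    def stepsGeneratorStep(self):
        # Check the input and insert one processing step per ready item.
        while self._incoming:
            item = self._incoming.pop(0)
            self._insertFunctionStep(self.processStep, item)

    def processStep(self, item):
        processed = item.upper()
        # Create the output inside the lock to avoid concurrent writes.
        with self._lock:
            self.output.append(processed)


prot = MiniStreamingProtocol(["a", "b", "c"])
prot.stepsGeneratorStep()

# Run the inserted steps in parallel, as a parallel executor might.
threads = [threading.Thread(target=func, args=args) for func, args in prot.steps]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The order of prot.output depends on thread scheduling, which is why the lock matters: each append happens atomically even when steps finish at the same time.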

class pyworkflow.protocol.protocol.Protocol(**kwargs)[source]

Bases: Step

The Protocol is a higher-level type of Step. It also has inputs, outputs and the other Step properties, but it contains a list of steps that are executed.

addSummaryWarning(warningDescription)[source]

Appends the warningDescription param to the list of summaryWarnings. Will be printed in the protocol summary.

allowsDelete(obj)[source]
allowsGpu()[source]

Returns True if this protocol allows GPU computation.

appendJobId(jobId)[source]

Append active jobs to the list

checkSummaryWarnings()[source]

Checks for warnings that we want to tell the user about by adding a warning sign to the run box and a description to the run summary. List of warnings checked: 1. If the folder for this protocol run exists.

citations()[source]

Return a citation message to provide some information to users.

cleanExecutionAttributes()[source]

Clean all the execution attributes.

cleanTmp()[source]

Delete all files and subdirectories under Tmp folder.

cleanWorkingDir()[source]

Delete all files and subdirectories related to the protocol.

closeMappers()[source]

Close the mappers of all output Sets.

continueFromInteractive()[source]

TODO: REMOVE this function. Check if there is an interactive step and set it as finished. This is now used mainly in picking, but it should be removed since it is confusing for users.

copy(other, copyId=True, excludeInputs=False)[source]

Copies its attributes into the passed protocol

Parameters
  • other – protocol instance to copy the attributes to

  • copyId – True (default) copies the identifier

  • excludeInputs – False (default). If True, input attributes are excluded

copyDefinitionAttributes(other)[source]

Copy definition attributes to other protocol.

property cpuTime

Return the sum of all durations of the finished steps

debug(message)[source]
deleteOutput(output)[source]
error(message, redirectStandard=True)[source]
evalExpertLevel(paramName)[source]

Return the expert level evaluation for a param with the given name.

evalParamCondition(paramName)[source]

Evaluate whether the condition of paramName in _definition is satisfied with the current values of the protocol attributes.

evalParamExpertLevel(param)[source]

Return True if the param has an expert level that is less than the one for the whole protocol.

findAttributeName(attr2Find)[source]
getCitations(bibTexOutput=False)[source]
classmethod getClassDomain()[source]

Return the Domain class where this Protocol class is defined.

classmethod getClassLabel(prependPackageName=True)[source]

Return a more readable string representing the protocol class

classmethod getClassPackage()[source]

Return the package module to which this protocol belongs. This function will only work if, for the given Domain, the method Domain.getProtocols() has been called once. After calling this method, the protocol classes are registered with their Plugin and Domain info.

classmethod getClassPackageName()[source]
classmethod getClassPlugin()[source]
getDbPath()[source]
getDefaultRunName()[source]
getDefinition()[source]

Access the protocol definition.

getDefinitionDict()[source]

Similar to getObjDict, but only for those params that are in the form. This function is used to export protocols as a JSON text file.

getEnumText(paramName)[source]

This function will retrieve the text value of an enum parameter in the definition, taking the actual value in the protocol.

Parameters

paramName – the name of the enum param.

Returns

the string value corresponding to the enum choice.

getFileTag(fn)[source]
getFiles()[source]
getGpuList()[source]
classmethod getHelpText()[source]

Get help text to show in the protocol help button

getHostConfig()[source]

Return the configuration host.

getHostFullName()[source]

Return the full machine name where the protocol is running.

getHostName()[source]

Get the execution host name. This value is only the key of the host in the configuration file.

getInputStatus()[source]

Returns whether any input pointer is not ready yet and whether there is any pointer to an open set.

getJobIds()[source]

Return an iterable list of job ids associated with a running protocol.

getLogPaths()[source]
getLogsAsStrings()[source]
getLogsLastLines(lastLines=None, logFile=0)[source]

Get the last lines (lastLines) of a log file.

Parameters
  • lastLines – number of lines to return. If None, the ‘PROT_LOGS_LAST_LINES’ environment variable is tried; otherwise 20 is used.

  • logFile – log file to take the lines from. Default is 0 (stdout); use 1 for stderr.
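The fallback order described for lastLines (explicit value, then the PROT_LOGS_LAST_LINES environment variable, then 20) can be sketched as follows. The helper lastLinesOf and the temporary log file are assumptions made for illustration; this is not the pyworkflow implementation.

```python
import os
import tempfile
from collections import deque


def lastLinesOf(path, lastLines=None):
    """Hypothetical helper mirroring the documented fallback order."""
    if lastLines is None:
        # Fall back to the environment variable, then to 20.
        lastLines = int(os.environ.get("PROT_LOGS_LAST_LINES", 20))
    with open(path) as f:
        # A deque with maxlen keeps only the final lastLines lines.
        return [line.rstrip("\n") for line in deque(f, maxlen=lastLines)]


# Write a 30-line stand-in log file to read from.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("\n".join(f"line {i}" for i in range(1, 31)))
    logPath = tmp.name

tail3 = lastLinesOf(logPath, 3)
```

Using a bounded deque avoids holding the whole log in memory, which can matter for long-running protocols with large logs.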

getMapper()[source]
getObjectTag(objName)[source]
getOutputFiles()[source]

Return the output files produced by this protocol. This can be used in web to download results back.

getOutputsSize()[source]
getPackageCitations(bibTexOutput=False)[source]
getParam(paramName)[source]

Return a _definition param given its name.

getParsedMethods()[source]

Get the _methods results and parse possible cites.

getPath(*paths)[source]

Same as _getPath but without underscore.

getPid()[source]
classmethod getPlugin()[source]
classmethod getPluginLogoPath()[source]
getPossibleOutputs()[source]
getProject()[source]
getProtocolsToUpdate()[source]

This function returns a list of protocol ids that need to update their database to launch this protocol (this method is only used when a WORKFLOW is restarted or continued). Actions done here are:

  1. Iterate over the main input Pointers of this protocol (here, 3 different cases are analyzed):

    A. When the pointer points to a protocol.

    B. When the pointer points to another object (INDIRECTLY). The pointer has an _extended value (new parameters configuration in the protocol).

    C. When the pointer points to another object (DIRECTLY). The pointer has no _extended value (old parameters configuration in the protocol).

  2. The PROTOCOL to which the pointer points is determined and saved in the list.

  3. If this pointer points to a set (cases B and C):

    • Iterate over the main attributes of the set. If an attribute is a pointer, the pointed protocol is added to the ids list.
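The traversal listed above can be sketched with plain dictionaries. The pointer representation used here (a dict with a 'protId' key and an optional 'setAttrs' list) is purely hypothetical and only meant to show how ids are accumulated without duplicates; the real method works on Pointer objects.

```python
def protocolsToUpdate(inputPointers):
    """Collect ids of the protocols referenced by the input pointers.

    Each pointer is a hypothetical dict with:
      'protId'   - id of the protocol the pointer resolves to
      'setAttrs' - optional list of nested pointers found in a pointed set
    """
    ids = []
    for pointer in inputPointers:
        # Step 2: record the protocol the pointer resolves to.
        if pointer["protId"] not in ids:
            ids.append(pointer["protId"])
        # Step 3: for sets, follow the pointer attributes as well.
        for nested in pointer.get("setAttrs", []):
            if nested["protId"] not in ids:
                ids.append(nested["protId"])
    return ids


pointers = [
    {"protId": 10},                                # case A: points to a protocol
    {"protId": 12, "setAttrs": [{"protId": 7}]},   # case B: indirect pointer to a set
    {"protId": 10, "setAttrs": [{"protId": 12}]},  # case C: direct pointer to a set
]
ids = protocolsToUpdate(pointers)
```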

getQueueParams()[source]
getRelations()[source]

Return the relations created by this protocol.

getRunMode()[source]

Return the mode of execution, either: MODE_RESTART or MODE_RESUME.

getRunName()[source]
getScheduleLog()[source]
getSize()[source]

Returns the size of the folder corresponding to this protocol

getStatusMessage()[source]

Return the status string and, if running, the steps done.

getStderrLog()[source]
getStdoutLog()[source]
getSteps()[source]

Return the steps.sqlite file under logs directory.

getStepsFile()[source]

Return the steps.sqlite file under logs directory.

getStepsGraph(refresh=True)[source]

Build a graph taking into account the dependencies between steps. In streaming, the createOutputStep may appear first (e.g. 24) while depending on a later step (25).
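To illustrate why the graph must follow dependencies rather than insertion order, the sketch below uses the standard library graphlib module (Python 3.9+) on a hypothetical prerequisites mapping. The step indexes are invented, and this is not how pyworkflow builds its graph; it only shows a step with a lower index being ordered after the step it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical mapping: step index -> indexes of its prerequisite steps.
prerequisites = {
    23: [],
    24: [25],  # output step inserted before the step it waits for
    25: [23],
}

# static_order() yields every step after all of its prerequisites.
sorter = TopologicalSorter(prerequisites)
order = list(sorter.static_order())
```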

getSubmitDict()[source]

Return a dictionary with the necessary keys to launch the job to a queue system.

classmethod getUrl()[source]
getWorkingDir()[source]
static hasDefinition(cls)[source]

Check if the protocol has some definition. This can help to detect “abstract” protocols that only serve as a base for others and are not meant to be instantiated.

hasExpert()[source]

This function checks if the protocol has any expert parameter

hasQueueParams()[source]
hasSummaryWarnings()[source]
info(message, extra=None)[source]
classmethod isBase()[source]

Return True if this Protocol is a base class. Base classes should be marked with _label = None.
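The _label convention can be sketched with two hypothetical classes. This is an assumption about how such a check might look, not a copy of the pyworkflow code.

```python
class BaseProt:
    """Hypothetical base class, marked as such with _label = None."""
    _label = None

    @classmethod
    def isBase(cls):
        # A class without a label is treated as a base class.
        return cls._label is None


class ConcreteProt(BaseProt):
    """Hypothetical concrete protocol with a user-visible label."""
    _label = "concrete example"


baseFlag = BaseProt.isBase()
concreteFlag = ConcreteProt.isBase()
```

Note that a subclass that forgets to set _label inherits None and would also be reported as a base class under this sketch.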

classmethod isBeta()[source]
isChild()[source]

Return true if this protocol was invoked from a workflow (another protocol)

isContinued()[source]

Return if running in continue mode (MODE_RESUME).

classmethod isDisabled()[source]

Return True if this Protocol is disabled. Disabled protocols will not be offered in the available protocols.

isInStreaming()[source]
classmethod isInstalled()[source]
classmethod isNewDev()[source]
classmethod isUpdated()[source]
iterDefinitionAttributes()[source]

Iterate over all the attributes from definition.

iterDefinitionSections()[source]

Iterate over all the sections of the definition.

iterInputAttributes()[source]

Iterate over the main input parameters of this protocol. Currently, the inputs are assumed to be those attributes that are pointers and have no condition.

iterInputPointers()[source]

This function is similar to iterInputAttributes, but it yields all input Pointers, regardless of whether they have a value.

iterOutputAttributes(outputClass=None, includePossible=False)[source]

Iterate over the outputs produced by this protocol.

legacyCheck()[source]

Hook defined to run some compatibility checks before displaying the protocol.

loadMappers()[source]

Open mapper connections from previous closed outputs.

loadSteps()[source]

Load the Steps stored in the steps.sqlite file.

makePathsAndClean()[source]

Create the necessary path or clean if in RESTART mode.

makeWorkingDir()[source]
methods()[source]

Return a description of the methods used in the current protocol execution.

property numberOfSteps
processImportDict(importDict, importDir)[source]

This function is used when we import a workflow from a json to process or adjust the json data for reproducibility purposes, e.g. to resolve relative paths.

Parameters
  • importDict – dict of the protocol that we got from the json

  • importDir – directory of the json we’re importing
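One adjustment of this kind, resolving relative paths against the directory of the imported json, might look like the sketch below. The helper resolveRelativePaths, the pathKeys argument and the dictionary keys are hypothetical; the real method decides by itself which entries to adjust.

```python
import os


def resolveRelativePaths(importDict, importDir, pathKeys):
    """Return a copy of importDict with relative paths made absolute.

    pathKeys is a hypothetical list naming the entries that hold paths.
    """
    resolved = dict(importDict)  # leave the caller's dict untouched
    for key in pathKeys:
        value = resolved.get(key)
        if isinstance(value, str) and not os.path.isabs(value):
            # Interpret the path relative to the directory of the json file.
            resolved[key] = os.path.normpath(os.path.join(importDir, value))
    return resolved


importDict = {"filesPath": "../movies", "samplingRate": 1.2}
resolved = resolveRelativePaths(importDict, "/data/workflows", ["filesPath"])
```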

removeJobId(jobId)[source]

Remove inactive jobs from the list

requiresGpu()[source]

Return True if this protocol can only be executed in GPU.

run()[source]

Before calling this method, the working dir for the protocol to run should exist.

runJob(program, arguments, **kwargs)[source]
runProtocol(protocol)[source]

Setup another protocol to be run from a workflow.

setAborted()[source]

Abort the protocol, finalize the steps and close all open sets

setFailed(msg)[source]

Set the run failed and close all open sets.

setHostConfig(config)[source]
setHostFullName(hostFullName)[source]
setHostName(hostName)[source]

Set the execution host name (the host key in the config file)

setJobId(jobId)[source]

Reset this list to have the first active job

setJobIds(jobIds)[source]

Reset this list to have a list of active jobs

setMapper(mapper)[source]

Set a new mapper for the protocol to persist state.

setPid(pid)[source]
setProject(project)[source]
setQueueParams(queueParams)[source]
setRunning()[source]

Do not reset the init time in RESUME_MODE

setStepsExecutor(executor=None)[source]
setWorkingDir(path)[source]
property stepsDone

Return the number of steps executed.

summary()[source]

Return a summary message to provide some information to users.

updateSteps()[source]

After the steps list is modified, this method will update the steps information. It will save the steps list and also the number of steps.

useQueue()[source]

Return True if the protocol should be launched through a queue.

useQueueForProtocol()[source]

This function will return True if the protocol has been set to be launched through a queue

useQueueForSteps()[source]

This function will return True if the protocol has been set to be launched through a queue by steps

usesGpu()[source]
validate()[source]

Check that the input parameters are correct. Return a list of errors; if the list is empty, everything is OK.
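The "list of errors" contract can be sketched as a standalone function. The parameter names inputSet and numberOfThreads and the messages are hypothetical; a real protocol would inspect its own attributes.

```python
def validate(params):
    """Hypothetical validator: returns a list of error messages."""
    errors = []
    if params.get("inputSet") is None:
        errors.append("An input set is required.")
    if params.get("numberOfThreads", 1) < 1:
        errors.append("numberOfThreads must be at least 1.")
    return errors


okErrors = validate({"inputSet": "movies", "numberOfThreads": 4})
badErrors = validate({"numberOfThreads": 0})
```

Collecting all messages before returning, instead of stopping at the first problem, lets the user fix every issue in a single pass.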

classmethod validateInstallation()[source]

Check if the installation of this protocol is correct. By default, we check whether the protocol’s package provides a validateInstallation function and use it. Returning an empty list means that the installation is correct and there are no errors. If some errors are found, a list with the error messages is returned.

classmethod validatePackageVersion(varName, errors)[source]

Function to validate that the package version specified in the configuration file ~/.config/scipion/scipion.conf is among the available options and that it is properly installed.

Parameters
  • package – the package object (e.g. eman2 or relion). Package should contain the following methods: getVersion(), getSupportedVersions()

  • varName – the expected environment var containing the path (and version)

  • errors – list of strings to add errors if found

warning(message, redirectStandard=True)[source]
warnings()[source]

Return warning messages that could be errors. The user should approve executing a protocol with warnings.

classmethod worksInStreaming()[source]
class pyworkflow.protocol.protocol.RunJobStep(runJobFunc=None, programName=None, arguments=None, resultFiles=[], **kwargs)[source]

Bases: FunctionStep

This Step wraps the commonly used function runJob for launching specific programs with some parameters. The runJob function should be provided by the protocol when inserting a new RunJobStep.

Params:

  • func – the function that will be executed.

  • funcName – the name assigned to that function (will be stored).

  • *funcArgs – argument list passed to the function (serialized and stored).

  • **kwargs – extra parameters.

class pyworkflow.protocol.protocol.Step(**kwargs)[source]

Bases: Object

Basic execution unit. It should define its inputs and outputs and a run method.

addPrerequisites(*newPrerequisites)[source]
getElapsedTime(default=datetime.timedelta(0))[source]

Return the time it took to run (or the current running time if it is still running).
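A possible reading of this behaviour is sketched below with a hypothetical helper. The parameter names initTime, endTime and now are assumptions, and the logic is a guess at the intent of the description, not the pyworkflow implementation.

```python
import datetime


def elapsedTime(initTime, endTime=None, now=None, default=datetime.timedelta(0)):
    """Hypothetical helper; not the pyworkflow implementation."""
    if initTime is None:
        return default  # the step never started
    if endTime is None:
        # Still running: measure up to the current moment.
        endTime = now or datetime.datetime.now()
    return endTime - initTime


start = datetime.datetime(2024, 1, 1, 12, 0, 0)
finished = elapsedTime(start, endTime=datetime.datetime(2024, 1, 1, 12, 5, 0))
running = elapsedTime(start, now=datetime.datetime(2024, 1, 1, 12, 1, 30))
notStarted = elapsedTime(None)
```

The now argument exists only to make the sketch deterministic; without it the helper falls back to the system clock.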

getError()[source]
getErrorMessage()[source]
getIndex()[source]
getPrerequisites()[source]
getStatus()[source]
isAborted()[source]
isActive()[source]
isFailed()[source]
isFinished()[source]
isInteractive()[source]
isLaunched()[source]
isNew()[source]
isRunning()[source]
isSaved()[source]
isScheduled()[source]
isWaiting()[source]
run()[source]

Do the job of this step

setAborted()[source]

Set the status to aborted and update the endTime.

setFailed(msg)[source]

Set the run failed and store an error message.

setFinished()[source]

Set the status to finished and update the end time.

setIndex(newIndex)[source]
setInteractive(value)[source]
setPrerequisites(*newPrerequisites)[source]
setRunning()[source]

Set the state to STATE_RUNNING and set the init and end times.

setSaved()[source]

Set the status to saved and update the endTime.

setStatus(value)[source]
class pyworkflow.protocol.protocol.StepSet(filename=None, prefix='', mapperClass=None, **kwargs)[source]

Bases: Set

Special type of Set for storing steps.

pyworkflow.protocol.protocol.getProtocolFromDb(projectPath, protDbPath, protId, chdir=False)[source]

Retrieve the Protocol object from a given .sqlite file and the protocol id.

pyworkflow.protocol.protocol.getUpdatedProtocol(protocol)[source]

Retrieve the updated protocol and close db connections

pyworkflow.protocol.protocol.isProtocolUpToDate(protocol)[source]

Check timestamps between protocol lastModificationDate and the corresponding runs.db timestamp
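A timestamp comparison of this kind can be sketched against a file's modification time. The helper isUpToDate, the stand-in database file and the direction of the comparison (in-memory timestamp not older than the file) are assumptions made for illustration.

```python
import datetime
import os
import tempfile


def isUpToDate(lastModification, dbPath):
    """Hypothetical check: True if the in-memory timestamp is not older than the file."""
    dbTime = datetime.datetime.fromtimestamp(os.path.getmtime(dbPath))
    return lastModification >= dbTime


# Create a stand-in database file with a fixed modification time.
with tempfile.NamedTemporaryFile(suffix=".db", delete=False) as tmp:
    dbPath = tmp.name
stamp = datetime.datetime(2024, 1, 1, 12, 0, 0)
os.utime(dbPath, (stamp.timestamp(), stamp.timestamp()))

fresh = isUpToDate(datetime.datetime(2024, 1, 1, 12, 0, 5), dbPath)
stale = isUpToDate(datetime.datetime(2024, 1, 1, 11, 59, 0), dbPath)
```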

pyworkflow.protocol.protocol.runProtocolMain(projectPath, protDbPath, protId)[source]

Main entry point when a protocol will be executed. This function should be called when:

scipion runprotocol ...
Parameters
  • projectPath – the absolute path to the project directory.

  • protDbPath – path to protocol db relative to projectPath

  • protId – id of the protocol object in db.