xmipp3.protocols.protocol_preprocess.protocol_preprocess module

class xmipp3.protocols.protocol_preprocess.protocol_preprocess.XmippPreprocessHelper[source]

Bases: object

Helper class containing protocol utility methods shared by XmippProtPreprocessParticles and XmippProtPreprocessVolumes.

class xmipp3.protocols.protocol_preprocess.protocol_preprocess.XmippProtPreprocessParticles(**kwargs)[source]

Bases: XmippProcessParticles

Preprocesses particle images by applying any of the following optional steps: dust removal, phase randomization, normalization, image centering, phase flipping, contrast inversion, and thresholding. Thresholding selects pixels that are below, above, or in absolute value below (abs_below) a grey-value threshold, and replaces them with a fill value, a binarized value, or the average of the non-selected pixels. This cleaning stage improves particle quality and consistency for downstream tasks.

AI Generated

## Overview

The Preprocess Particles protocol applies a sequence of optional preprocessing operations to a set of particle images.

Particle preprocessing is often used to clean images, standardize contrast, normalize background statistics, remove extreme pixel values, phase-flip CTF effects, or prepare particles for later alignment, classification, and reconstruction.

The protocol can apply several operations, including:

  • dust removal;

  • phase randomization beyond a selected resolution;

  • normalization;

  • image centering;

  • phase flipping;

  • contrast inversion;

  • thresholding.

The selected operations are applied in the order defined by the protocol. The main output is a new set of preprocessed particles.

## Inputs and General Workflow

The input is a set of particles.

The protocol converts the input particle set to Xmipp metadata format. It then runs the selected preprocessing operations sequentially. Each operation uses the output of the previous one, so the order of selected operations matters.

The output particle set preserves the input metadata when possible and points to the processed particle images.

## Input Particles

The Input particles parameter defines the particle set to be preprocessed.

The protocol does not change the biological identity of the particles. It changes the image values and, for phase flipping, updates the phase-flipped status of the output set.

The output particles can be used in downstream protocols such as 2D classification, alignment, reconstruction, screening, or additional cleaning.

## Dust Removal

The Dust removal option detects pixels with unusually large absolute values and replaces them with random values from a Gaussian distribution with zero mean and unit standard deviation.

This is useful for removing isolated very bright or very dark artifacts, such as detector spikes, hot pixels, or other extreme outliers.

The Threshold for dust removal parameter controls how extreme a pixel must be to be treated as dust. It is expressed in units of the image standard deviation: pixels whose signal is higher or lower than this value times the image standard deviation are replaced.

The default value, 3.5, is suggested for cryo-EM. For high-contrast negative stain images, a higher value may be preferable because real signal can be much stronger.
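The dust-removal rule described above can be illustrated with a short NumPy sketch. This is not the Xmipp implementation: the function name `remove_dust` is hypothetical, and the test "absolute value greater than threshold times the image standard deviation" is an assumption based on the description in this section.

```python
import numpy as np

def remove_dust(image, threshold=3.5, rng=None):
    """Replace extreme pixels with draws from a N(0, 1) distribution."""
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=float)
    # Assumption: a pixel is "dust" when |value| > threshold * image std.
    dust = np.abs(image) > threshold * image.std()
    cleaned = image.copy()
    # Zero mean, unit standard deviation, as stated in the text.
    cleaned[dust] = rng.standard_normal(int(dust.sum()))
    return cleaned, dust

rng = np.random.default_rng(0)
particle = rng.standard_normal((64, 64))
particle[10, 20] = 50.0  # simulated hot pixel
cleaned, dust = remove_dust(particle, threshold=3.5, rng=rng)
```

Raising `threshold` makes the selection more conservative, which matches the advice to use a higher value for high-contrast negative stain images.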

## Randomize Phases

The Randomize phases option randomizes Fourier phases beyond a selected resolution.

The Maximum Resolution parameter defines the resolution, in angstroms, beyond which phases are randomized.

This operation is useful for control experiments or for removing high-frequency phase information beyond a chosen limit. It should not be used as routine particle cleaning unless the user has a specific validation or preprocessing reason.
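As a rough illustration of what phase randomization beyond a resolution limit means, the following NumPy sketch keeps Fourier amplitudes and replaces the phases of all coefficients with spatial frequency above 1 / Maximum Resolution. The function name, the `sampling_rate` argument (in angstroms per pixel), and the handling of Hermitian symmetry are assumptions made for this sketch; the Xmipp program may differ in these details.

```python
import numpy as np

def randomize_phases(image, sampling_rate, max_resolution, rng=None):
    """Return (real-space image, modified Fourier transform)."""
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=float)
    ft = np.fft.fft2(image)

    # Spatial frequency of every Fourier coefficient, in 1/angstrom.
    fy = np.fft.fftfreq(image.shape[0], d=sampling_rate)
    fx = np.fft.fftfreq(image.shape[1], d=sampling_rate)
    freq = np.hypot(fy[:, None], fx[None, :])

    # "Beyond" the limit means finer detail, i.e. higher frequency.
    beyond = freq > 1.0 / max_resolution

    # Keep amplitudes, replace phases with uniform random angles.
    phases = rng.uniform(0.0, 2.0 * np.pi, size=int(beyond.sum()))
    ft_random = ft.copy()
    ft_random[beyond] = np.abs(ft[beyond]) * np.exp(1j * phases)

    # Simplification: the random phases are not forced to be Hermitian,
    # so the inverse transform is complex and only its real part is kept.
    return np.fft.ifft2(ft_random).real, ft_random

rng = np.random.default_rng(1)
particle = rng.standard_normal((32, 32))
randomized, ft_random = randomize_phases(particle, sampling_rate=1.5,
                                         max_resolution=10.0, rng=rng)
```

Coefficients at or below the cutoff frequency are left untouched, so low-resolution features of the particle are preserved.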

## Normalize

The Normalize option standardizes particle intensity values.

The protocol supports three normalization modes:

OldXmipp sets the whole-image mean to 0 and standard deviation to 1.

NewXmipp sets the background mean to 0 and background standard deviation to 1.

Ramp subtracts a background ramp and then applies the NewXmipp background normalization.

Normalization is useful because many downstream algorithms assume comparable particle intensity statistics.

## Background Radius

The Background radius parameter is used for NewXmipp and Ramp normalization.

Pixels outside a centered circle of this radius are treated as background. Their statistics are used to normalize the particle images. If the radius is less than or equal to 0, the protocol uses half the particle box size.

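The radius should be chosen so that the background region does not include significant particle density. If the radius is too small, particle density leaks into the background region; if it is too large, only a few corner pixels remain, and in both cases the background statistics may be biased.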

The protocol validates that the background radius is not larger than the particle half-size.
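The three normalization modes and the role of the background radius can be sketched in NumPy as follows. The function names are hypothetical, and the ramp is modelled here as a least-squares plane fitted to the background pixels, which is an assumption about what "background ramp" means rather than a description of the Xmipp code.

```python
import numpy as np

def background_mask(shape, radius):
    """True for pixels outside a centered circle of the given radius."""
    ny, nx = shape
    if radius <= 0:  # fall back to half the box size, as in the text
        radius = min(ny, nx) // 2
    y, x = np.ogrid[:ny, :nx]
    return np.hypot(y - ny // 2, x - nx // 2) > radius

def normalize_old(image):
    # OldXmipp: whole-image mean 0 and standard deviation 1.
    return (image - image.mean()) / image.std()

def normalize_new(image, radius=-1):
    # NewXmipp: background mean 0 and background standard deviation 1.
    bg = image[background_mask(image.shape, radius)]
    return (image - bg.mean()) / bg.std()

def normalize_ramp(image, radius=-1):
    # Ramp: subtract a plane fitted to the background, then NewXmipp.
    mask = background_mask(image.shape, radius)
    y, x = np.nonzero(mask)
    design = np.column_stack([x, y, np.ones_like(x)]).astype(float)
    coeffs, *_ = np.linalg.lstsq(design, image[mask], rcond=None)
    yy, xx = np.indices(image.shape)
    ramp = coeffs[0] * xx + coeffs[1] * yy + coeffs[2]
    return normalize_new(image - ramp, radius)

rng = np.random.default_rng(0)
yy, xx = np.indices((64, 64))
particle = rng.standard_normal((64, 64)) + 0.05 * xx + 3.0  # tilted background
normalized = normalize_ramp(particle, radius=24)
```

After `normalize_new` or `normalize_ramp`, only the pixels outside the circle are guaranteed to have zero mean and unit standard deviation; the particle region keeps whatever contrast it has relative to that background.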

## Center Images

The Center images option recenters particle images using the Xmipp image centering tool.

This can be useful when particles are approximately centered but still show small systematic shifts.

Centering should be used carefully. If particles contain multiple components, strong background artifacts, or highly asymmetric density, automatic centering may not correspond to the desired biological center.

## Phase Flip Images

The Phase flip images option applies CTF phase flipping to the particle images.

The protocol uses the particle sampling rate and the CTF information associated with the input particles.

Phase flipping changes the phase-flipped status of the output particle set. The output set is marked as phase flipped if the input was not phase flipped, and vice versa.

This option is useful when the user wants to correct CTF phase inversions while keeping a relatively simple CTF treatment.

## Invert Contrast

The Invert contrast option multiplies particle images by -1.

This is useful when the particles have the opposite contrast convention from what downstream protocols expect. For example, it can convert dark particles on a light background into light particles on a dark background, or the other way around.

Incorrect contrast inversion can make later processing fail, so the output should be inspected visually.

## Threshold

The Threshold option replaces selected pixel values according to a threshold rule.

The protocol first selects pixels using one of three modes:

abs_below selects pixels whose absolute value is below the threshold.

below selects pixels below the threshold.

above selects pixels above the threshold.

Then it substitutes the selected pixels using the selected substitution mode.

## Threshold Value

The Threshold value parameter defines the gray-value cutoff used by the threshold operation.

The meaning of this value depends on the selected threshold type. For example, in below mode, pixels below this value are selected. In above mode, pixels above this value are selected.

Thresholding can be useful for cleaning or binarization-like operations, but it can also remove weak signal if used too aggressively.

## Substitute By

The Substitute by parameter controls how selected pixels are replaced.

The options are:

value, where selected pixels are replaced by the user-provided fill value;

binarize, where selected and non-selected pixels are converted into a binary representation;

avg, where selected pixels are replaced by the average of the non-selected pixels.

This parameter determines whether thresholding behaves as intensity clipping, mask-like binarization, or background substitution.

## Fill Value

The Fill value parameter is used when Substitute by = value.

It defines the numerical value assigned to selected pixels.

A common value is 0, but other values may be useful depending on the image normalization and background convention.
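The selection modes and substitution modes described in the Threshold, Substitute By, and Fill Value sections can be combined in one small NumPy sketch. The function name `apply_threshold` is hypothetical. The polarity used for binarize (selected pixels become 0, all others become 1) is an assumption; the text above only says that pixels are converted into a binary representation.

```python
import numpy as np

def apply_threshold(image, value, mode="abs_below",
                    substitute="value", fill=0.0):
    image = np.asarray(image, dtype=float)

    # Step 1: select pixels according to the threshold type.
    if mode == "abs_below":
        selected = np.abs(image) < value
    elif mode == "below":
        selected = image < value
    elif mode == "above":
        selected = image > value
    else:
        raise ValueError(f"unknown mode: {mode}")

    # Step 2: replace the selected pixels.
    out = image.copy()
    if substitute == "value":
        out[selected] = fill
    elif substitute == "binarize":
        # Assumed polarity: selected -> 0, non-selected -> 1.
        out = np.where(selected, 0.0, 1.0)
    elif substitute == "avg":
        if (~selected).any():  # nothing to average if all are selected
            out[selected] = image[~selected].mean()
    else:
        raise ValueError(f"unknown substitute: {substitute}")
    return out

img = np.array([[-2.0, -0.5], [0.5, 3.0]])
clipped = apply_threshold(img, 1.0, "abs_below", "value", fill=0.0)
binary = apply_threshold(img, 0.0, "below", "binarize")
averaged = apply_threshold(img, 1.0, "above", "avg")
```

In this example `clipped` zeroes the two weak pixels, `binary` is a 0/1 mask, and `averaged` replaces the single bright pixel with the mean of the other three.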

## Output Particles

The main output is outputParticles.

This output contains the preprocessed particle images after all selected operations have been applied.

The output particle set can be used directly in downstream Scipion protocols. When phase flipping is applied, the phase-flipped state of the output set is updated accordingly.

## Operation Order

The protocol applies operations in a defined order:

  1. dust removal;

  2. phase randomization;

  3. normalization;

  4. centering;

  5. phase flipping;

  6. contrast inversion;

  7. thresholding.

This order matters. For example, inverting contrast before thresholding may produce a different result than thresholding before inversion. Users should therefore select operations with the intended sequence in mind.
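The effect of a fixed operation order can be made concrete with a toy pipeline. The step names and the `run_pipeline` helper below are illustrative only; they are not the protocol's internal step functions, and only two of the seven steps are implemented.

```python
import numpy as np

# Fixed order taken from the list above.
STEP_ORDER = ["dust", "randomize", "normalize", "center",
              "phase_flip", "invert", "threshold"]

def invert(img):
    return -img

def threshold_below_zero(img):
    # Set every pixel below 0 to 0.
    return np.where(img < 0.0, 0.0, img)

STEPS = {"invert": invert, "threshold": threshold_below_zero}

def run_pipeline(image, selected):
    """Apply the selected steps in the fixed order, not selection order."""
    for name in STEP_ORDER:
        if name in selected:
            image = STEPS[name](image)
    return image

img = np.array([-2.0, -0.5, 0.5, 3.0])
invert_then_threshold = threshold_below_zero(invert(img))
threshold_then_invert = invert(threshold_below_zero(img))
pipeline_result = run_pipeline(img, {"threshold", "invert"})
```

Here inverting first keeps the originally negative pixels (now positive), while thresholding first discards them, so the two results differ. `run_pipeline` always follows the fixed order, so it reproduces the invert-then-threshold result regardless of how the selection is written.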

## Interpreting the Result

The output particles should be interpreted as processed versions of the input particles.

Preprocessing can improve consistency, remove artifacts, and prepare images for later algorithms. However, it can also remove useful signal or introduce bias if parameters are inappropriate.

The processed particles should be inspected visually, especially when using thresholding, contrast inversion, phase randomization, or aggressive dust removal.

## Practical Recommendations

Use dust removal to suppress isolated extreme pixels.

Use normalization for most particle-processing workflows, especially before classification or alignment.

Use Ramp normalization when background gradients are visible.

Use phase flipping only when CTF metadata are reliable and when the workflow requires phase-flipped particles.

Use contrast inversion only after confirming the expected contrast convention.

Use thresholding cautiously and inspect the output.

Apply only the operations that are needed. Avoid unnecessary preprocessing that may alter the particle signal.

## Final Perspective

Preprocess Particles is a general particle-cleaning and normalization protocol.

For biological users, its value is that it prepares particle images for later processing by standardizing intensity, correcting contrast conventions, removing extreme artifacts, and optionally applying phase flipping or thresholding.

The protocol is a preprocessing utility. Its output should improve technical consistency, but it should always be checked before being used in major downstream steps.

NORM_NEW = 1
NORM_OLD = 0
NORM_RAMP = 2
centerStep(args, changeInserts)[source]
invertStep(args, changeInserts)[source]
normalizeStep(args, changeInserts)[source]
phaseFlipStep(args, changeInserts)[source]
randomizeStep(args, changeInserts)[source]
removeDustStep(args, changeInserts)[source]
sortImages(outputFn, outputMd)[source]
thresholdStep(args, changeInserts)[source]
class xmipp3.protocols.protocol_preprocess.protocol_preprocess.XmippProtPreprocessVolumes(**kwargs)[source]

Bases: XmippProcessVolumes

Preprocesses 3D volumes using Xmipp tools to prepare them for further analysis. Optional operations include changing hand, changing icosahedral orientation, randomizing phases, symmetrizing (with a symmetry group, aggregation mode, and wrap setting), applying a Laplacian filter (optionally with a mask volume), adjusting gray values, segmenting, normalizing the background, inverting contrast, and thresholding.

AI Generated

## Overview

The Preprocess Volumes protocol applies a sequence of optional preprocessing operations to one volume or to a set of volumes.

Volume preprocessing is useful for changing map handedness, changing icosahedral conventions, randomizing phases, applying symmetry, denoising, adjusting gray levels, segmenting the molecule, normalizing the background, inverting contrast, or thresholding voxel values.

The protocol can be used as a general preparation tool before map comparison, alignment, validation, visualization, masking, subtraction, or refinement.

The main output is a processed volume or processed set of volumes.

## Inputs and General Workflow

The input can be a single volume or a set of volumes.

The selected operations are applied sequentially. Each operation acts on the result of the previous operation. For volume sets, the operation is applied to each volume in the set whenever applicable.

The protocol writes the processed volume or volume set and registers it as the output.

## Input Volumes

The Input volumes parameter defines the map or maps to be processed.

The input may be a single volume or a set of volumes.

The protocol does not decide which operations are biologically appropriate. The user selects the intended preprocessing steps and should inspect the result afterwards.

## Change Hand

The Change hand option applies a mirror transformation along the X axis.

This changes the handedness of the map.

Changing hand is useful when a reconstruction is known to have the wrong handedness or when comparing maps that use different hand conventions. It should not be applied casually, because it changes the chirality of the structure.
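On a voxel array, a mirror along X amounts to reversing the X axis. The sketch below assumes the volume is stored as a NumPy array indexed (z, y, x), so X is the last axis; the function name is hypothetical, and the exact centering convention used by Xmipp for even box sizes may differ.

```python
import numpy as np

def change_hand(volume):
    # Assumption: axes are ordered (z, y, x), so X is the last axis.
    return np.flip(volume, axis=-1)

rng = np.random.default_rng(0)
vol = rng.standard_normal((8, 8, 8))
mirrored = change_hand(vol)
```

Applying the operation twice returns the original map, which is a quick way to confirm that only handedness, not density, has changed.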

## Change Icosahedral Orientation

The Change icosahedral orientation option converts a volume from one standard icosahedral orientation convention to another.

The user selects the source convention in from and the target convention in to. The available conventions are:

  • i1;

  • i2;

  • i3;

  • i4.

This option is useful when maps with icosahedral symmetry need to be converted between Xmipp-supported symmetry conventions.

## Randomize Phases

The Randomize phases option randomizes Fourier phases beyond a selected resolution.

The Maximum Resolution parameter defines the resolution, in angstroms, beyond which phases are randomized.

This operation can be useful for control experiments or for suppressing high-frequency phase information beyond a chosen limit. It should be used with a clear validation or preprocessing purpose.

## Symmetrize

The Symmetrize option applies a symmetry group to the input volume.

The Symmetry group parameter defines the Xmipp symmetry to apply. The user should provide a valid symmetry group, such as a cyclic, dihedral, or icosahedral group. If no symmetry should be applied, the Symmetrize option should be disabled rather than setting the group to c1.

The protocol validates that c1 is not used as the symmetry group when symmetrization is requested.

## Aggregation Mode

The Aggregation mode parameter controls how symmetrized copies are combined.

There are two options:

Average averages the symmetry-related copies.

Sum sums them.

Average is usually appropriate when the goal is to produce a symmetrized map with comparable intensity scale. Sum may be useful in technical workflows where the accumulated density is desired.
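The difference between the two aggregation modes can be shown with a toy C4 symmetrization, where the four symmetry-related copies are exact 90-degree rotations about the Z axis. This is a simplified stand-in for the Xmipp symmetrization program: it assumes (z, y, x) axis order, a square (y, x) plane, and rotation about the array center, and the function name is hypothetical.

```python
import numpy as np

def symmetrize_c4(volume, aggregation="average"):
    volume = np.asarray(volume, dtype=float)
    if volume.shape[1] != volume.shape[2]:
        raise ValueError("rot90 needs a square (y, x) plane")
    # The four copies related by rotations of 0, 90, 180 and 270 degrees.
    copies = [np.rot90(volume, k, axes=(1, 2)) for k in range(4)]
    total = np.sum(copies, axis=0)
    if aggregation == "sum":
        return total
    if aggregation == "average":
        return total / len(copies)
    raise ValueError(f"unknown aggregation: {aggregation}")

rng = np.random.default_rng(0)
vol = rng.standard_normal((6, 8, 8))
averaged = symmetrize_c4(vol, "average")
summed = symmetrize_c4(vol, "sum")
```

Both results are invariant under the symmetry; they differ only by a factor equal to the number of copies, which is why Average keeps the intensity scale comparable to the input.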

## Wrap

The Wrap option controls whether density is wrapped around the box during symmetrization.

When enabled, density that leaves the box on one side during a transformation re-enters on the opposite side. When disabled, the protocol does not wrap density around the box.

Disabling wrap can help avoid artificial density appearing on the opposite side of the box when transformed density crosses a boundary.

## Mask Volume for Symmetrization or Laplacian

The Mask volume parameter can be used with symmetrization or Laplacian filtering.

For symmetrization, the mask can restrict the region used during the operation. For Laplacian filtering, it can provide a mask for the denoising step.

The mask should be in the same coordinate frame as the input volume.

## Apply Laplacian

The Apply Laplacian option applies a Laplacian-like denoising operation using Xmipp filtering.

The protocol uses a Retinex-style parameter internally and can optionally use a volume mask.

This operation is intended to enhance or denoise volume features, but it should be inspected carefully because filtering can alter map appearance.

## Adjust Gray Values

The Adjust gray values option adjusts the volume gray values so that they are compatible with a set of projection images.

The Set of particles parameter accepts a set of particles, a set of averages, or a set of 2D classes. The protocol uses up to 200 images for the adjustment.

If the images already have projection alignment, those alignments are used. If not, the protocol estimates orientations against the input volume using significant alignment, with CPU or GPU execution depending on the settings.

The adjustment then modifies volume gray levels according to the image set.

## Images for Gray-Value Adjustment

The Set of particles parameter provides the images used for gray-level adjustment.

These images should have the final pixel size and final image size expected for the model.

The image set should correspond to projections of the same structure. If the images are unrelated, poorly aligned, or strongly heterogeneous, the gray-level adjustment may be unreliable.

## Symmetry Group for Significant Alignment

The Symmetry group parameter under gray-value adjustment defines the symmetry used when assigning orientations to the input images.

If no symmetry is present, use c1.

This setting is used only when the input images do not already have projection alignment and the protocol must estimate orientations for the adjustment step.

## GPU Execution for Significant Alignment

The protocol includes hidden GPU settings for the significant-alignment step used during gray-value adjustment.

If GPU execution is requested, the protocol checks that the required Xmipp CUDA programs are available. If they are not found, validation reports an error.

GPU execution can accelerate orientation assignment when adjustment images do not already have projection alignment.

## Segment

The Segment option separates the molecule from the background.

The protocol first creates a mask using the selected segmentation method, then applies that binary mask to the volume.

Segmentation is useful when the user wants to remove background or keep only the molecular region.

## Segmentation Type

The Segmentation Type parameter controls how the segmentation mask is created.

The available options are:

Voxel mass, where the user provides the target number of voxels.

Aminoacid mass, where the user provides an approximate number of amino acids.

Dalton mass, where the user provides the molecular mass in daltons.

Automatic, where the protocol uses Otsu thresholding.

The Molecule Mass parameter is used for all segmentation types except Automatic.
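The three mass-based segmentation types differ only in the unit of the Molecule Mass value. The sketch below converts each unit to a voxel count and then keeps that many of the highest-density voxels. The conversion constants (110 Da per amino acid, 0.81 Da per cubic angstrom), the function names, and the "keep the N densest voxels" rule are all assumptions for illustration; the Xmipp segmentation program may use different constants and additional steps.

```python
import numpy as np

DALTONS_PER_RESIDUE = 110.0         # assumed average residue mass
DALTONS_PER_CUBIC_ANGSTROM = 0.81   # assumed typical protein density

def mass_to_voxels(mass, kind, sampling_rate):
    """Convert a Molecule Mass value to a target number of voxels."""
    if kind == "voxel":
        return int(round(mass))
    if kind == "aminoacid":
        daltons = mass * DALTONS_PER_RESIDUE
    elif kind == "dalton":
        daltons = mass
    else:
        raise ValueError(f"unknown mass type: {kind}")
    cubic_angstroms = daltons / DALTONS_PER_CUBIC_ANGSTROM
    return int(round(cubic_angstroms / sampling_rate ** 3))

def segment_by_voxel_count(volume, n_voxels):
    """Keep the n_voxels densest voxels and zero everything else."""
    ordered = np.sort(volume, axis=None)[::-1]
    n = max(1, min(n_voxels, ordered.size))
    mask = volume >= ordered[n - 1]
    return volume * mask, mask

rng = np.random.default_rng(0)
vol = rng.random((16, 16, 16))
target = mass_to_voxels(100, "aminoacid", sampling_rate=4.0)
segmented, mask = segment_by_voxel_count(vol, target)
```

Because the voxel count scales with the inverse cube of the sampling rate, the same molecular mass occupies far fewer voxels in a coarsely sampled map.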

## Normalize Background

The Normalize background option normalizes the volume background so that it has zero mean and standard deviation 1.

The Mask Radius parameter defines the radius, in pixels, used to identify the region outside the molecule as background. If it is set to -1, the protocol uses half the size of the volume.

The protocol validates that the radius is not larger than the allowed volume half-size.

## Invert Contrast

The Invert contrast option multiplies the volume values by -1.

This can be useful when the map contrast convention is opposite to what a downstream protocol expects.

Because it changes the sign of density, it should be used only when the contrast convention is clearly known.

## Threshold

The Threshold option replaces selected voxel values according to a threshold rule.

The selection can be:

abs_below, selecting voxels whose absolute value is below the threshold;

below, selecting voxels below the threshold;

above, selecting voxels above the threshold.

Selected voxels are then substituted according to the chosen substitution mode.

## Substitute By

The Substitute by parameter controls how threshold-selected voxels are replaced.

The options are:

value, replacing selected voxels with the user-provided fill value;

binarize, converting selected and non-selected voxels into a binary representation;

avg, replacing selected voxels by the average of the non-selected voxels.

This option determines whether thresholding behaves as clipping, binarization, or background replacement.

## Operation Order

The selected operations are applied in the following order:

  1. change hand;

  2. change icosahedral orientation;

  3. randomize phases;

  4. symmetrize;

  5. Laplacian filtering;

  6. gray-value adjustment;

  7. segmentation;

  8. background normalization;

  9. contrast inversion;

  10. thresholding.

The order matters. For example, segmenting before normalizing may produce a different result than normalizing before segmenting.

## Output Volumes

The main output is the processed volume or volume set.

For a single input volume, the output is one processed map. For a set of volumes, the output contains the processed version of each input map.

The output can be used in downstream protocols such as alignment, comparison, masking, filtering, subtraction, validation, or visualization.

## Validation Rules

The protocol checks that an input volume or volume set has been provided.

If background normalization is selected, the background radius must be valid for the input volume size.

If symmetrization is selected, the symmetry group must not be c1. To avoid symmetrization, the user should disable the Symmetrize option.

If GPU execution is requested for significant alignment and the required CUDA programs are not available, the protocol reports a validation error.
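The rules above can be summarized as a function that collects error messages, a pattern commonly used for protocol validation. The function below is a standalone sketch: its name, arguments, and messages are hypothetical and do not reproduce the protocol's own validation method.

```python
def validate_volume_preprocess(has_input, box_size,
                               normalize_background=False, mask_radius=-1,
                               symmetrize=False, symmetry_group="c1",
                               use_gpu=False, cuda_available=True):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    if not has_input:
        errors.append("An input volume or set of volumes is required.")
    # A radius of -1 means "use half the volume size", so it always passes.
    if normalize_background and mask_radius > box_size / 2:
        errors.append("Mask radius is larger than half the volume size.")
    if symmetrize and symmetry_group.strip().lower() == "c1":
        errors.append("c1 is not allowed; disable Symmetrize instead.")
    if use_gpu and not cuda_available:
        errors.append("GPU requested but Xmipp CUDA programs were not found.")
    return errors

ok = validate_volume_preprocess(True, 128, normalize_background=True)
bad = validate_volume_preprocess(False, 128,
                                 normalize_background=True, mask_radius=100,
                                 symmetrize=True, symmetry_group="c1",
                                 use_gpu=True, cuda_available=False)
```

Collecting all messages instead of stopping at the first failure lets the user fix every problem in a single pass.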

## Interpreting the Result

The output should be interpreted as a processed version of the input volume or volumes.

Some operations are purely geometrical, such as changing hand or icosahedral orientation. Others change intensities, such as normalization, gray-value adjustment, thresholding, or contrast inversion. Others change structural support, such as segmentation or masking.

Because these operations can strongly affect map appearance, the output should always be inspected together with the original input.

## Practical Recommendations

Use change hand only when the map handedness is known to be wrong.

Use icosahedral orientation conversion only for maps with icosahedral symmetry and known convention differences.

Use symmetrization only when symmetry is biologically justified.

Use gray-value adjustment when a map must be made compatible with a set of projection images.

Use segmentation and thresholding cautiously, especially when weak or flexible density is important.

Use background normalization to standardize maps before downstream analysis.

Inspect the output after each major preprocessing workflow. When uncertain, run separate preprocessing protocols for individual operations so that their effects can be checked independently.

## Final Perspective

Preprocess Volumes is a general map-preparation protocol.

For biological users, its value is that it gathers several common volume preprocessing operations in one place: changing hand, changing symmetry orientation, randomizing phases, symmetrizing, denoising, adjusting gray values, segmenting, normalizing, inverting contrast, and thresholding.

The protocol is powerful but should be used carefully. Each selected operation changes the map in a specific way, and biological interpretation should always be based on the processed output together with the original map and the processing context.

AGG_AVERAGE = 0
AGG_SUM = 1
SEG_AMIN = 1
SEG_AUTO = 3
SEG_DALTON = 2
SEG_VOXEL = 0
adjustStep(isFirstStep, changeInserts)[source]
changeHandStep(args, changeInserts)[source]
invertStep(args, changeInserts)[source]
laplacianStep(args, changeInserts)[source]
normalizeStep(args, changeInserts)[source]
projectionStep(changeInserts)[source]
randomizeStep(args, changeInserts)[source]
removeDustStep(args, changeInserts)[source]
rotateIcoStep(args, changeInserts)[source]
segmentStep(args, changeInserts)[source]
symmetrizeStep(args, changeInserts)[source]
thresholdStep(args, changeInserts)[source]