xmipp3.protocols.protocol_denoise_particles module

class xmipp3.protocols.protocol_denoise_particles.XmippProtDenoiseParticles(**kwargs)[source]

Bases: ProtProcessParticles

Remove particles noise by filtering them. This filtering process is based on a projection over a basis created from some averages (extracted from classes). This filtering is not intended for processing particles. The huge filtering they will be passed through is known to remove part of the signal with the noise. However this is a good method for clearly see which particle are we going to process before it’s done.

AI Generated

## Overview

The Denoise Particles protocol produces a strongly filtered version of a particle set by projecting the particles onto a reduced basis learned from 2D class averages or averages supplied by the user. The goal is not to create particles for high-resolution reconstruction, but to obtain cleaner-looking images that make it easier to visually understand what kind of particles are present in the dataset.

In practical cryo-EM workflows, this protocol is mainly a visualization and inspection tool. It can help the user assess whether a set of particles is broadly consistent with a given 2D classification, whether the particles contain recognizable structural signal, or whether large fractions of the dataset look suspicious. Because the filtering is intentionally strong, it also removes part of the true signal together with the noise. For that reason, the output should not be interpreted as a faithful restoration of the experimental images.

For a biological user, the main value of this protocol is that it gives a simplified, less noisy view of the data, which is often useful before making decisions about subsequent processing.

## Inputs and General Workflow

The protocol requires two inputs: a set of particles to be denoised and a set of classes or averages used to construct the basis for denoising.

The input particles must already have alignment information relative to the chosen classes or averages. This is the standard situation after 2D classification methods such as CL2D or ML2D. This requirement is important because the denoising relies on the particles being expressed in a coordinate system that is consistent with the basis learned from the class information.

The workflow has two conceptual stages. First, the protocol builds a PCA-like basis from the supplied classes or averages. Second, each particle is projected onto a selected number of basis components, and the particle is reconstructed from that reduced representation. The reconstructed image is smoother and less noisy than the original one, but it is also more model-driven.

## The Role of the Input Classes or Averages

The classes or averages define the space of images in which the particles will be represented. In other words, the denoising is guided by the dominant patterns present in those reference images.

Biologically, this means that the result depends strongly on the quality and relevance of the chosen classes. If the input classes represent the real structural views present in the particles, the denoised images can become much easier to interpret. If the classes are poor, heterogeneous, or not representative of the particles, the denoising may produce misleading images.

This is therefore not a generic image denoiser. It is a class-guided denoising procedure, and its output reflects the image space spanned by the chosen classes or averages.

## Basis Construction

The protocol constructs a reduced basis using a rotational PCA procedure. The purpose of this basis is to capture the main modes of variation present in the input classes or averages.

Two parameters are especially relevant here. The maximum number of classes controls how many reference images are used to build the basis. The maximum number of PCA bases controls how many components are retained during basis construction.

In practice, using more classes can make the basis more representative of the structural diversity in the data, but also potentially more heterogeneous. Using fewer classes produces a simpler basis, which may be easier to interpret but may not capture all meaningful views.

Similarly, a larger number of PCA components allows a richer representation, while a smaller number enforces stronger simplification.

## Denoising by Projection onto the Basis

Once the basis has been computed, each particle is projected onto a chosen number of PCA components, and the image is reconstructed from that projection.

This is the central denoising step. By representing each particle with only the main components of the basis, much of the small-scale noise is suppressed. At the same time, image details that are not well represented by the basis may also disappear.

The key parameter here is the number of PCA bases on which to project. Smaller values lead to more aggressive denoising and smoother particles, but at the risk of oversimplifying them. Larger values preserve more detail, but also keep more noise.

A practical way to think about this is that the number of projection bases controls the balance between interpretability and faithfulness to the original data.

## Interpretation of the Output

The output is a new set of particles containing denoised images. These particles preserve the identity of the input particles, but their image content has been replaced by a filtered reconstruction derived from the learned basis.

These denoised particles are useful for: * visually inspecting the dominant particle appearance, * checking whether the dataset broadly matches the chosen classes, * identifying obvious mismatches or poor-quality subsets, * preparing exploratory visual summaries of particle content.

However, they should not be used as a substitute for the original particles in serious downstream processing such as high-resolution refinement or final reconstruction. The protocol documentation is explicit on this point: the filtering is strong and removes genuine signal together with noise.

From a biological perspective, the output is best interpreted as an illustrative representation of the particles, not as a more accurate

## Practical Recommendations

This protocol is best used after a meaningful 2D classification, when the user already has class averages that reflect the major views present in the dataset. In that situation, the denoised particles can provide a very intuitive picture of what the particle set contains.

A good starting point is to use a reasonably representative class set and a moderate number of PCA components. If the denoised particles still look too noisy, the number of projection components can be reduced. If they look oversmoothed or overly similar, it may be better to increase the number of components or revise the class set used to build the basis.

Users should be cautious when the dataset contains substantial heterogeneity. If the class set represents only one subset of the data, particles belonging to other states may be poorly represented and may be denoised into misleading images.

## Outputs and Their Interpretation

The protocol produces a new set called outputParticles, which contains the denoised version of the input particle images.

The output keeps the general metadata and identity of the original set, but the particle images now correspond to the filtered reconstructions. This makes it easy to compare the denoised particles with the originals or to use them for visual inspection in the same workflow context.

## Final Perspective

The Denoise Particles protocol is a guided image-simplification tool designed to help users look at noisy particle data through the lens of an existing class-based representation. Its main purpose is not to improve the dataset for reconstruction, but to make it easier to understand and inspect.

For most cryo-EM users, it should be viewed as a convenient visualization aid: useful for exploring, checking, and communicating what the particles look like, but not as a replacement for the original experimental images in quantitative processing.

createOutputStep()[source]
denoiseImages()[source]