Averaging with EMAN2
DENSS reconstructs particles iteratively, starting from a grid of points filled with random values. As a result, each run of DENSS with identical input parameters yields a slightly different result. These results differ at high resolution but should be similar at low resolution. Therefore, to determine the low-resolution density, one must run DENSS multiple times and average the results. This can be done easily with EMAN2. EMAN2 is primarily a package for reconstructing 3D volumes from electron microscopy data, but it is also a general grayscale image processing suite geared toward scientific use. This makes EMAN2 a high-quality, easy-to-use, and well-supported solution for aligning and averaging multiple DENSS reconstructions.
For convenience, the DENSS download includes a bash script called superdenss that runs the entire pipeline: multiple DENSS reconstructions followed by averaging with EMAN2. By default, superdenss runs everything in parallel on a multicore machine, using one fewer core than is available on the system (adjustable with the -j option). To run the individual reconstructions in parallel, superdenss uses GNU parallel. If GNU parallel is not installed, superdenss runs the individual reconstructions in a loop, but still runs EMAN2 in parallel.
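The defaults described above can be pictured with a small Python sketch. This is not the superdenss source (which is a bash script). The function names are hypothetical, and the floor of one job on a single-core machine is an assumption.

```python
import os
import shutil


def default_jobs(available_cores):
    """One fewer core than available; the floor of 1 is an assumption."""
    return max(1, available_cores - 1)


def plan_run(jobs=None):
    """Return (jobs, use_gnu_parallel) for a hypothetical superdenss-like wrapper."""
    cores = os.cpu_count() or 1
    if jobs is None:  # corresponds to not passing -j
        jobs = default_jobs(cores)
    # GNU parallel installs an executable named "parallel"; if it is absent,
    # the reconstructions would be run one after another in a loop.
    use_gnu_parallel = shutil.which("parallel") is not None
    return jobs, use_gnu_parallel


if __name__ == "__main__":
    print(plan_run())
```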
To run superdenss with default parameters, type:
If this is the first time you have run superdenss using this input file, this will create a folder called “6lyz” containing the results. If you have run this command before with the same input file or that directory name already exists, superdenss will create a new directory and append incrementing numbers to the end of the directory name so that your previous results are not overwritten. If you would like to change the output filename prefix, set the -o option. For example, to change the output name to “lysozyme”, type:
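The "do not overwrite previous results" behaviour can be sketched as follows. The exact suffix format superdenss uses is not stated here, so the "_N" pattern and the function name below are assumptions for illustration only.

```python
def next_output_dir(prefix, existing):
    """Return prefix if it is unused, else prefix plus the first unused number.

    The "_N" suffix format (starting at 1) is an assumption, not taken
    from superdenss itself.
    """
    if prefix not in existing:
        return prefix
    n = 1
    while f"{prefix}_{n}" in existing:
        n += 1
    return f"{prefix}_{n}"


if __name__ == "__main__":
    # Hypothetical directory listing after two earlier runs.
    print(next_output_dir("6lyz", {"6lyz", "6lyz_1"}))
```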
If you would like to enter additional options for running the individual DENSS reconstructions, you can invoke them using the -i option, and surround all the options you would like to give to denss.py with one set of double quotes. For example, to force DENSS to use a different Dmax than in the GNOM file, type:
Options for superdenss should be given outside of the quotes, while options for denss.py should be given inside the quotes after the -i option. For example, to run denss.py with greater oversampling and more samples, type:
If you also want to run more than 20 reconstructions (the default for superdenss), use the superdenss -n option as follows:
This will run 100 reconstructions with denss.py while setting the Dmax to 55, the oversampling ratio to 5, and the number of samples to 64.
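The quoting rule can be checked with Python's shlex module, which splits a command line the way a POSIX shell would. The command string below is an illustrative sketch: the input file name 6lyz.out, the -f flag, and the denss.py flags -d, -os, and -n are assumptions not stated in the text above, so check them against the help output of your installed version.

```python
import shlex

# Assumed flags: -f (input file) for superdenss, and -d (Dmax),
# -os (oversampling), -n (number of samples) for denss.py.
command = 'superdenss -f 6lyz.out -o lysozyme -i "-d 55.0 -os 5.0 -n 64" -n 100'

tokens = shlex.split(command)  # split as a POSIX shell would
for token in tokens:
    print(repr(token))

# The value after -i is one quoted string that is handed on to denss.py.
denss_options = tokens[tokens.index("-i") + 1]

# The final -n is outside the quotes, so it is read by superdenss itself.
n_reconstructions = tokens[-1]
```

Because the quoted string arrives as a single argument, the -n inside the quotes (number of samples) and the -n outside them (number of reconstructions) do not collide.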
This will create a folder named “lysozyme” containing all the output files for each of the individual runs of DENSS, a file called “lysozyme_avg.mrc” containing the final averaged density, and a file called “lysozyme_fsc.txt” containing the Fourier Shell Correlation used to estimate resolution. Additionally, the resolution estimated from the FSC curve will be printed to the screen.
Enantiomer Generation and Selection
DENSS (version 1.0.2 or later) includes an option in the superdenss script to deal with possible enantiomer ambiguity. Enantiomers (i.e. mirror images of particles) are ambiguous in solution scattering and yield identical scattering profiles. Depending on the random seed it starts with, denss.py may produce different enantiomers. Because of this, different enantiomers may get averaged together. For some particle shapes this is not much of an issue, because the possible enantiomers may only be distinguishable at resolutions higher than can be reconstructed. For more complex particle shapes, however, different enantiomers may be clearly distinguishable and yield incorrect results if averaged together. Previously, you were required to manually sort or classify these particles into different sets for each enantiomer and then average each set independently. The -e option in superdenss instead generates each of the eight enantiomers (each of the three axes is either reflected or not, i.e. 2^3 = 8) and compares each one to the reference volume to select the enantiomer that agrees best. The averaging procedure then continues as normal using only the best enantiomers. Note that this process takes significantly longer (>8x) than running without the -e option, which is why it is disabled by default.
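The idea of generating the eight candidates and keeping the best one can be illustrated with a minimal pure-Python sketch. It is not the superdenss implementation: the grid is a tiny nested list, the function names are hypothetical, the agreement score is a plain sum of squared differences, and the alignment step that precedes comparison in the real pipeline is omitted.

```python
from itertools import product


def flip(grid, flips):
    """Mirror a 3D nested-list grid along each axis whose flag is True."""
    fx, fy, fz = flips
    out = [[list(row) for row in plane] for plane in grid]
    if fx:
        out = out[::-1]
    if fy:
        out = [plane[::-1] for plane in out]
    if fz:
        out = [[row[::-1] for row in plane] for plane in out]
    return out


def squared_difference(a, b):
    """Sum of squared voxel differences; lower means better agreement."""
    return sum(
        (va - vb) ** 2
        for pa, pb in zip(a, b)
        for ra, rb in zip(pa, pb)
        for va, vb in zip(ra, rb)
    )


def best_candidate(grid, reference):
    """Try all 2**3 = 8 flip combinations; keep the one closest to the reference."""
    candidates = list(product([False, True], repeat=3))
    best = min(candidates, key=lambda f: squared_difference(flip(grid, f), reference))
    return best, flip(grid, best)


if __name__ == "__main__":
    # Made-up 2x2x2 "density" with distinct values, used as the reference.
    reference = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    flipped = flip(reference, (True, False, True))
    print(best_candidate(flipped, reference)[0])
```

Since each reconstruction requires eight comparisons instead of one, this also gives an intuition for why the -e option is several times slower.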
To run superdenss in slow mode (suitable for most complex particle shapes) with enantiomer generation and selection enabled, simply add the -e option to superdenss. Here we will take a look at a particularly complex particle shape, SusF (PDB ID 4FE9). This protein was selected from the PDB as exhibiting the single highest ambiguity score as estimated by AMBIMETER (a score of 3.019). The complexity of the shape makes incorrect enantiomer selection a serious problem. It is also a good demonstration of the ability of DENSS to reconstruct complex shapes. Note the image below on the left shows a single reconstruction of denss.py that created the correct enantiomer. Shown in the middle is an example of a single reconstruction that selected the incorrect enantiomer. Shown on the right is the final reconstruction after generating and selecting for the correct enantiomers.
To have superdenss create each of the eight enantiomers and select the best one for averaging for each of the 20 reconstructions, add the -e option to superdenss as follows:
This will create a folder named “4FE9” containing the 4FE9_avg.mrc file with the fully averaged reconstruction, taking into account the best enantiomers. This runs denss.py in the default “slow” mode. The result shows that even for quite complex cases, the default slow mode should work well in combination with enantiomer selection.
It should be noted that this procedure does not ensure that the final averaged reconstruction represents the actual correct enantiomer. It only ensures that all of the reconstructions used in the averaging procedure have the same handedness. There is no way to select the correct enantiomer without some ancillary data, since enantiomers are ambiguous in solution scattering.
The averaging procedure works by aligning each reconstruction against a reference map. The reference map is calculated using a binary tree algorithm, where several pairs of maps are aligned and averaged in steps, ultimately generating a simple average. The Fourier Shell Correlation comparing each reconstruction to the reference is calculated, and the average of all FSC curves is saved in a file named “6lyz_fsc.dat”. This curve can be used to estimate resolution: find the x-axis value where the FSC falls below 0.5 and take its reciprocal, which gives the estimated resolution in Å. For convenience, this resolution is estimated and printed to the log file and screen for you. If the Python module matplotlib is available, you can use the supplied fsc2res.py script to make a plot of the FSC and estimated resolution, which will be saved to a png file. To estimate resolution by comparing with a known structure, see denss.align.py and denss.calcfsc.py below and the Tips page. Note that this approach to estimating resolution differs from the calculation EMAN2 performs on its own (which compares averages of two halves of the reconstructions).
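The resolution estimate described above can be worked through by hand or with a short sketch like the one below. This is an illustration rather than the code DENSS uses: the function names are hypothetical, the data are made up, and linear interpolation between the two points that bracket the 0.5 threshold is an assumption.

```python
def mean_curve(curves):
    """Average several FSC curves sampled at the same x-axis values."""
    return [sum(values) / len(values) for values in zip(*curves)]


def resolution_at_threshold(x, fsc, threshold=0.5):
    """Return 1/x at the point where the curve first drops below the threshold.

    x is assumed to be positive and in reciprocal angstroms. Linear
    interpolation between the bracketing points is an assumption.
    Returns None if the curve never falls below the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            x_cross = x[i - 1] + frac * (x[i] - x[i - 1])
            return 1.0 / x_cross
    return None


if __name__ == "__main__":
    # Made-up FSC curves for two reconstructions against the reference.
    x = [0.01, 0.02, 0.03, 0.04]
    curves = [[1.0, 0.9, 0.7, 0.5], [1.0, 0.9, 0.5, 0.3]]
    average = mean_curve(curves)
    print(resolution_at_threshold(x, average))
```

With these made-up numbers the averaged curve crosses 0.5 halfway between x = 0.03 and x = 0.04, so the estimate is the reciprocal of 0.035, roughly 28.6 Å.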