Basic Usage
Here we will use the provided 6lyz.out file for this tutorial, created by GNOM. We will also show how to use the 6lyz.dat file (in case a GNOM formatted .out file is not available for your data). DENSS accepts a variety of file types, including 3-column ASCII text .dat files containing raw data or a smooth fit (columns q, I, error), 4-column ASCII text .fit files containing raw data and the smooth fit (columns q, I, error, fit), and GNOM .out files.
First, to simply run DENSS with default parameters (suitable for many cases), type at the command prompt:
$ denss.py -f 6lyz.out
The estimated maximum dimension, Dmax, is required for setting the size of the box. When possible, Dmax is extracted from the file. This works for file types that store Dmax and that DENSS can parse, such as GNOM .out files and DENSS .fit files. If Dmax cannot be extracted from the file, DENSS estimates it automatically. For a standard .dat file containing raw data, DENSS will attempt to fit the data while estimating Dmax. In all cases, the Dmax extracted from the file or estimated by DENSS can be overridden using the -d option as follows:
$ denss.py -f 6lyz.dat -d 50.0
The box size DENSS creates is equal to Dmax*oversampling. Thus, this command tells DENSS to create a real space box with an edge length of 150 Å, since Dmax is set to 50 Å and the default oversampling ratio is 3.
FAST, SLOW and MEMBRANE Modes
DENSS has convenient options for selecting good defaults for the vast majority of cases. Three “modes” are included to select these defaults: FAST, SLOW and MEMBRANE. The default mode is SLOW, since it is suitable for most cases and takes roughly 2 to 20 minutes on a modern processor*. If you have a relatively simple globular shape, you can select FAST mode using the -m option of denss.py, which takes about 30 to 60 seconds*. If you have a membrane protein, or any object that might have negative contrast, you can use MEMBRANE mode instead. Read below for more details on this mode.
*Note: If you install the latest version of NumPy with Anaconda, NumPy will often be installed with MKL. MKL allows multicore acceleration with NumPy. This provides a significant speed-up for denss.py, particularly on processors with several cores.
SLOW mode will set the box size to be Dmax * oversampling, and will set the number of samples (N) to 64 by adjusting the voxel size. FAST mode will use 32 samples instead. Other options affected by these modes can be found on the Advanced Options page. All of the options can still be overridden when using these modes by explicitly setting them.
MEMBRANE mode, new as of version 1.4.9, disables the positivity restraint and thus allows for negative contrast. Similar to SLOW mode, MEMBRANE mode will use 64 samples. However, since positivity is turned off, shrink-wrap will start immediately. While this mode is designed for objects with negative contrast, it will work even in cases that do not have significant negative contrast. However, note that due to disabling the positivity restraint, density maps may be noisier than usual and occasionally require more reconstructions during the averaging step.
Since N must be an integer, the voxel size is adjusted to the nearest value that yields an integer N, even if you set an explicit voxel size. To find the voxel size actually used, check the .log file for the line “Real space voxel size”. The separate line “Requested real space voxel size” reports the value passed to the program through the -v option.
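The arithmetic described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the rounding rule (nearest integer, with an odd N increased to the next even number) is an assumption rather than the exact logic inside denss.py.

```python
def box_side(dmax, oversampling=3.0):
    """Edge length of the real-space box in Å (Dmax * oversampling)."""
    return dmax * oversampling


def snap_voxel(side, requested_voxel):
    """Return (N, actual_voxel) for a requested voxel size in Å.

    Assumption: N is the nearest integer to side / voxel, and an odd N is
    increased to the next even number. DENSS itself may round differently.
    """
    n = max(2, round(side / requested_voxel))
    if n % 2:
        n += 1
    return n, side / n


side = box_side(50.0)              # Dmax = 50 Å, default oversampling of 3
n, voxel = snap_voxel(side, 2.3)   # request 2.3 Å voxels
print(f"box = {side} Å, N = {n}, actual voxel = {voxel:.4f} Å")
```

The requested and actual voxel sizes differ slightly here, which is the same distinction the two .log lines make.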
Other commonly used options include:
-v VOXEL, --voxel VOXEL
    Set desired voxel size, setting resolution of map
-os, --oversampling OVERSAMPLING
    Sampling ratio (default 3.0)
-n NSAMPLES, --nsamples NSAMPLES
    Number of samples along a single dimension. (sets voxel size, overridden by --voxel, default=64)
--ne NE
    Number of electrons in object (for scaling final density map, default=10,000)
-ncs NCS, --ncs NCS
    Rotational symmetry
-ncs_steps NCS_STEPS [NCS_STEPS ...], --ncs_steps NCS_STEPS [NCS_STEPS ...]
    List of steps for applying NCS averaging (default=3000)
-ncs_axis NCS_AXIS, --ncs_axis NCS_AXIS
    Rotational symmetry axis (options: 1, 2, or 3 corresponding to xyz principal axes)
-s STEPS, --steps STEPS
    Maximum number of steps (iterations)
-o OUTPUT, --output OUTPUT
    Output map filename (default basename of input file)
-m MODE, --mode MODE
    Mode. F(AST) sets default options to run quickly for simple particle shapes. S(LOW) useful for more complex molecules. (default SLOW)
An important option for complex cases is -os, the oversampling ratio. For scattering profiles with a lot of features, higher oversampling in reciprocal space helps to ensure that the final density has a scattering profile that matches the data. A handy option, often invoked together with -os, is -n, which sets the number of samples by adjusting the voxel size to a value that results in the requested N. The FFT procedure works most efficiently when N is a power of 2, so good choices are 32, 64, and 128; however, any number will work (an odd number is increased to the next even number). More samples do not necessarily increase the final resolution of the reconstruction, because the SAXS profile carries limited information, but in complex cases they may be required for an accurate comparison of the calculated scattering profile to the experimental scattering profile in reciprocal space (see the Tips page for how to assess this). In such cases, try more samples with an appropriately increased -os. The run will take much longer, but the default run is fairly quick to begin with. In extreme cases N = 128 may be used, but more than that is rarely necessary or useful. Remember that the total array size grows as N³.
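To make the N³ growth concrete, here is a small back-of-the-envelope calculation. It assumes one 8-byte floating point value per voxel, which is an assumption for illustration; the number and type of arrays DENSS actually holds in memory is not stated here.

```python
BYTES_PER_VALUE = 8  # assumption: one float64 per voxel


def grid_mebibytes(n, bytes_per_value=BYTES_PER_VALUE):
    """Memory needed for a single n x n x n array, in MiB."""
    return n ** 3 * bytes_per_value / 2 ** 20


for n in (32, 64, 128):
    print(f"N={n:3d}: {n ** 3:>9,d} voxels, {grid_mebibytes(n):5.2f} MiB per array")
```

Each doubling of N multiplies the number of voxels, and therefore the memory and the work per FFT, by at least a factor of 8.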
NCS Symmetry Averaging
Since v1.4.6, denss.py includes an option for using symmetry averaging to improve the accuracy of reconstructions. The NCS (non-crystallographic symmetry) averaging procedure works by first aligning the principal axes of the map to the x, y, z axes, sorted from longest to shortest axis. Then the symmetry mate for the given rotational symmetry operator (simple N-fold rotation about the principal axis) is calculated (via interpolation), and the density value at each rotational position is set to the average of all density values at those positions. To invoke NCS symmetry averaging, simply set the --ncs option. For example, to impose 2-fold symmetry, type at the command prompt:
$ denss.py -f 6lyz.out -ncs 2
For lysozyme this doesn’t make sense, since 6LYZ does not have 2-fold symmetry; it is shown only for illustration. By default the symmetry averaging is performed a few times, at steps 3000, 5000, 7000, and 9000. To impose NCS at alternative steps, set the --ncs_steps option. This option takes a space-separated list of steps. Selecting more steps leads to a stronger NCS restraint, but at the cost of imposing more bias (and sometimes interpolation artifacts, i.e. smearing or stripes). Since the NCS is not imposed at every step, the reconstruction is not forced to be symmetrical. Also by default, the symmetry axis is assumed to be the longest principal axis. However, in many cases the longest axis is not the symmetry axis. To select a different symmetry axis, invoke the --ncs_axis option. This option takes an integer, either 1, 2, or 3. By default this is set to 1, i.e. the first, or longest, axis. For example, to impose 3-fold symmetry at steps 3000, 5000, and 7000 about the shortest principal axis:
$ denss.py -f 6lyz.out -ncs 3 -ncs_steps 3000 5000 7000 -ncs_axis 3
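The averaging idea can be illustrated with a simplified NumPy sketch. It is not the DENSS implementation: the function name is hypothetical, the map is assumed to be already aligned so that the symmetry axis is array axis 0 and passes through the grid centre, and only 2-fold and 4-fold rotations are handled so that no interpolation is needed (DENSS interpolates to support any N-fold rotation).

```python
import numpy as np


def ncs_average(rho, n_fold):
    """Average a cubic map over an n-fold rotation about array axis 0.

    Assumptions: the symmetry axis is aligned with axis 0 and passes through
    the grid centre, and n_fold is 2 or 4 so every rotation is a multiple of
    90 degrees and needs no interpolation.
    """
    if n_fold not in (2, 4):
        raise ValueError("this sketch only handles 2- or 4-fold rotation")
    quarter_turns = 4 // n_fold
    # One rotated copy ("symmetry mate") per element of the rotation group.
    mates = [np.rot90(rho, k=i * quarter_turns, axes=(1, 2)) for i in range(n_fold)]
    return np.mean(mates, axis=0)


rng = np.random.default_rng(0)
rho = rng.random((8, 8, 8))          # stand-in for a density map
sym = ncs_average(rho, 2)

# The averaged map is unchanged by a further 180 degree rotation.
print(np.allclose(sym, np.rot90(sym, k=2, axes=(1, 2))))
```

Applying such an average at only a few steps, rather than at every step, is what keeps the restraint from forcing exact symmetry on the final map.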
Calculation Time
The speed of the program is pretty much entirely dictated by N. In some cases the algorithm will converge in fewer steps than in other cases, so it does vary some in that sense. For simple cases where N = 32, a single run of denss.py takes about 20 seconds or so on a modern processor (e.g. 2.5 GHz, Intel i5). For the default case where N = 64, it takes about 3 minutes; for N = 128 it takes over 30 minutes. However, as noted below, running multiple reconstructions (20 for simple cases, maybe 100 for complex cases) is required, and the averaging process tends to take longer than the reconstructions themselves.
Results
As the program runs, the current status will be printed to the screen like so:
$ denss.py -f 6lyz.out
Step  Chi2      Rg      Support Volume
----- --------- ------- --------------
  349  8.35e+04   34.76        3375000
These values will update in line as the program progresses. “Step” is the current iteration. “Chi2” (or rather χ²) is the goodness of fit of the calculated intensity versus the interpolated experimental intensity. “Rg” is the radius of gyration calculated directly from the electron density map. “Support Volume” is the total volume in Å³ of the support, i.e. the voxels containing the particle, which is determined by shrink-wrap (see below).
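For readers who want to see what these two quantities mean numerically, here is a hedged sketch. The function names are hypothetical, and the χ² normalisation (a simple mean over q points) is an assumption; DENSS may scale or normalise differently.

```python
import numpy as np


def chi2(i_calc, i_data, sigma):
    """Mean squared error-weighted residual (normalisation is an assumption)."""
    i_calc, i_data, sigma = (np.asarray(a, dtype=float) for a in (i_calc, i_data, sigma))
    return float(np.mean(((i_calc - i_data) / sigma) ** 2))


def rg_from_density(rho, voxel):
    """Radius of gyration (Å) of a cubic map, weighted by density.

    Distances are measured from the density-weighted centre and the map is
    treated as non-periodic.
    """
    n = rho.shape[0]
    axis = np.arange(n) * voxel
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    total = rho.sum()
    cx, cy, cz = ((rho * c).sum() / total for c in (x, y, z))
    r2 = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
    return float(np.sqrt((rho * r2).sum() / total))


# Two equal point-like densities 10 voxels (20 Å) apart along x.
rho = np.zeros((16, 16, 16))
rho[3, 8, 8] = rho[13, 8, 8] = 1.0
print(rg_from_density(rho, voxel=2.0))
print(chi2([2.0, 4.0], [1.0, 2.0], [1.0, 2.0]))
```

In the toy map each point sits 10 Å from the centre, so the radius of gyration is 10 Å; both residuals are exactly one error bar, so the χ² sketch returns 1.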
Additionally, when the --enforce_connectivity restraint is imposed, an integer will be printed to the right of the Support Volume and the results line will be moved down one line. This number is the count of “features” that the --enforce_connectivity option found, i.e. the number of separate blobs. If this number is 1, shrink-wrap has already removed all the other disconnected blobs of density and the restraint won’t have much effect. This happens more often for simple globular particles. However, for less globular particle shapes, there are often several disconnected features at this early stage of the reconstruction, so the --enforce_connectivity restraint eliminates all of the minor features, retaining only the feature with the greatest density. It will look something like this:
$ denss.py -f 6lyz.out
Step  Chi2      Rg      Support Volume
----- --------- ------- --------------
 5999  5.62e+01   16.36          52234 1
 8259  1.31e+00   14.34          42135
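The connectivity restraint can be pictured as a connected-component search over the support. The sketch below is an illustration under several assumptions, not the DENSS code: the function name is hypothetical, voxels are considered connected only through shared faces, the box edges are not treated as periodic, and “greatest density” is interpreted as the largest summed density.

```python
from collections import deque

import numpy as np

# Face-sharing neighbours only (6-connectivity), an assumption of this sketch.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def keep_densest_feature(rho, support):
    """Return (new_support, n_features), keeping only the densest blob."""
    visited = np.zeros(support.shape, dtype=bool)
    best_mask, best_sum, n_features = None, -np.inf, 0
    for start in zip(*np.nonzero(support)):
        if visited[start]:
            continue
        n_features += 1
        mask = np.zeros(support.shape, dtype=bool)
        queue = deque([start])
        visited[start] = True
        while queue:                      # breadth-first flood fill
            v = queue.popleft()
            mask[v] = True
            for d in STEPS:
                w = tuple(a + b for a, b in zip(v, d))
                inside = all(0 <= c < s for c, s in zip(w, support.shape))
                if inside and support[w] and not visited[w]:
                    visited[w] = True
                    queue.append(w)
        total = rho[mask].sum()
        if total > best_sum:
            best_mask, best_sum = mask, total
    if best_mask is None:                 # empty support: nothing to keep
        return support.copy(), 0
    return best_mask, n_features


rho = np.zeros((8, 8, 8))
rho[1:4, 1:4, 1:4] = 1.0                  # large blob, 27 voxels
rho[6, 6, 6] = 0.5                        # small disconnected blob
new_support, n_features = keep_densest_feature(rho, rho > 0)
print(n_features, int(new_support.sum()))
```

Here the returned feature count plays the role of the extra integer printed next to the Support Volume.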
Some notes about these values:
In most cases, the χ² value reported should only be used as a relative indicator of whether or not convergence is occurring. χ² is very sensitive to accurate error estimates and can easily be orders of magnitude off in scale (particularly here, as we are interpolating values and using smooth curves which do not adjust errors to account for oversampling). However, as the reconstruction progresses, you should notice a steady decline in the reported value. Do not be concerned if the value fluctuates up and down (particularly around the step where the --enforce_connectivity option is invoked), as this is often part of the convergence process.
Towards the beginning of the reconstruction, Rg values may appear drastically off and may even be negative. This is simply because the Rg calculation does not understand the concept of periodic boundaries in the FFT, and when multiple separate blobs are present the calculation is inaccurate. However, typically after the --enforce_connectivity option removes extra blobs in the support and the density gets recentered, the Rg should change (often drastically) to more reasonable values and proceed to converge until completion. Due to the random nature of the starting seed of the algorithm, multiple reconstructions often vary in final Rg by a few (maybe 5 or so) percent.
The “Support Volume” is only the volume of the support region and should not be confused with the actual volume of the particle. The “Support Volume” will always be larger than the volume of the particle, since it is the volume of the voxels containing the particle. However, it should give you some idea of an upper bound on the particle volume. To actually estimate particle volume from the density, you can open the .mrc file in Chimera and use the “Measure Volume and Area” tool. This will calculate the volume of the particle at the given density threshold (which is user defined).
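As a quick check on how the Support Volume number arises, the sketch below multiplies a voxel count by the voxel volume. The function name is hypothetical; the numbers assume the 150 Å box and N = 64 used as an example in this tutorial.

```python
def support_volume(n_support_voxels, voxel):
    """Volume of the support in Å^3: number of voxels times voxel volume."""
    return n_support_voxels * voxel ** 3


n = 64
voxel = 150.0 / n                 # 150 Å box sampled with N = 64

# Before shrink-wrap has removed anything, the support fills the whole box.
print(support_volume(n ** 3, voxel))
```

With every voxel in the support, the result is simply the box volume, 150³ = 3375000 Å³, which is the value shown in the early-step screen output earlier in this section.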
Output Files
Electron density maps are written in CCP4/MRC format (credit Andrew Bruno) and optionally as Xplor ASCII text format (with the --write_xplor option enabled). These files can be opened directly in some visualization programs such as Chimera and PyMOL. In particular, the PyMOL “volume” function is well suited for displaying these maps with density information displayed as varying color and opacity. Maps can be converted to other formats using tools such as the Situs map2map tool or the EMAN2 e2proc3d.py program.
Output files include:
output.mrc
    electron density map (MRC format)
output_support.mrc
    final support volume formatted as unitary electron density map
output_stats_by_step.dat
    statistics as a function of step number. Three columns: chi^2, Rg, support volume
output_map.fit
    The fit of the calculated scattering profile to the experimental data. Experimental data has been interpolated to the q values used for scaling intensities and I(0) has been scaled to the square of the number of electrons in the particle. Columns are: q(data), I(data), error(data), q(calc), I(calc)
output_*.png
    If plotting is enabled, these are plots of the results.
output.log
    A log file containing parameters for the calculation and summary of the results.
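The per-step statistics file lends itself to quick inspection in Python. The reader below is a minimal sketch: it assumes the three columns are whitespace-separated and that any header or comment lines start with “#”, neither of which is confirmed here, and the function name is hypothetical. The sample rows reuse values from the screen output shown earlier.

```python
import io


def read_stats(handle):
    """Parse rows of (chi^2, Rg, support volume) from an open text file.

    Assumptions: whitespace-separated columns; blank lines and lines
    starting with '#' are skipped.
    """
    rows = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chi2, rg, volume = (float(x) for x in line.split()[:3])
        rows.append((chi2, rg, volume))
    return rows


# In practice: with open("output_stats_by_step.dat") as handle: ...
example = io.StringIO(
    "8.35e+04 34.76 3375000\n"
    "5.62e+01 16.36 52234\n"
    "1.31e+00 14.34 42135\n"
)
rows = read_stats(example)
final_chi2, final_rg, final_volume = rows[-1]
print(f"final chi^2 = {final_chi2}, Rg = {final_rg} Å, support = {final_volume} Å^3")
```

Plotting the first column against row number is a simple way to confirm the steady decline in χ² described in the notes above.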