Skip to main content
  • Submit
  • About
    • Editorial Board
    • PNAS Staff
    • FAQ
    • Accessibility Statement
    • Rights and Permissions
    • Site Map
  • Contact
  • Journal Club
  • Subscribe
    • Subscription Rates
    • Subscriptions FAQ
    • Open Access
    • Recommend PNAS to Your Librarian
  • Log in
  • My Cart

Main menu

  • Home
  • Articles
    • Current
    • Special Feature Articles - Most Recent
    • Special Features
    • Colloquia
    • Collected Articles
    • PNAS Classics
    • List of Issues
  • Front Matter
  • News
    • For the Press
    • This Week In PNAS
    • PNAS in the News
  • Podcasts
  • Authors
    • Information for Authors
    • Editorial and Journal Policies
    • Submission Procedures
    • Fees and Licenses
  • Submit
  • About
    • Editorial Board
    • PNAS Staff
    • FAQ
    • Accessibility Statement
    • Rights and Permissions
    • Site Map
  • Contact
  • Journal Club
  • Subscribe
    • Subscription Rates
    • Subscriptions FAQ
    • Open Access
    • Recommend PNAS to Your Librarian

User menu

  • Log in
  • My Cart

Search

  • Advanced search
Home
Home

Advanced Search

  • Home
  • Articles
    • Current
    • Special Feature Articles - Most Recent
    • Special Features
    • Colloquia
    • Collected Articles
    • PNAS Classics
    • List of Issues
  • Front Matter
  • News
    • For the Press
    • This Week In PNAS
    • PNAS in the News
  • Podcasts
  • Authors
    • Information for Authors
    • Editorial and Journal Policies
    • Submission Procedures
    • Fees and Licenses

New Research In

Physical Sciences

Featured Portals

  • Physics
  • Chemistry
  • Sustainability Science

Articles by Topic

  • Applied Mathematics
  • Applied Physical Sciences
  • Astronomy
  • Computer Sciences
  • Earth, Atmospheric, and Planetary Sciences
  • Engineering
  • Environmental Sciences
  • Mathematics
  • Statistics

Social Sciences

Featured Portals

  • Anthropology
  • Sustainability Science

Articles by Topic

  • Economic Sciences
  • Environmental Sciences
  • Political Sciences
  • Psychological and Cognitive Sciences
  • Social Sciences

Biological Sciences

Featured Portals

  • Sustainability Science

Articles by Topic

  • Agricultural Sciences
  • Anthropology
  • Applied Biological Sciences
  • Biochemistry
  • Biophysics and Computational Biology
  • Cell Biology
  • Developmental Biology
  • Ecology
  • Environmental Sciences
  • Evolution
  • Genetics
  • Immunology and Inflammation
  • Medical Sciences
  • Microbiology
  • Neuroscience
  • Pharmacology
  • Physiology
  • Plant Biology
  • Population Biology
  • Psychological and Cognitive Sciences
  • Sustainability Science
  • Systems Biology
Research Article

Algorithmic framework for X-ray nanocrystallographic reconstruction in the presence of the indexing ambiguity

Jeffrey J. Donatelli and James A. Sethian
PNAS January 14, 2014 111 (2) 593-598; https://doi.org/10.1073/pnas.1321790111
Jeffrey J. Donatelli
aDepartment of Mathematics and
bLawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
James A. Sethian
aDepartment of Mathematics and
bLawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720
  • Find this author on Google Scholar
  • Find this author on PubMed
  • Search for this author on this site
  • For correspondence: sethian@math.berkeley.edu
  1. Contributed by James A. Sethian, November 21, 2013 (sent for review September 23, 2013)

  • Article
  • Figures & SI
  • Info & Metrics
  • PDF
Loading

Significance

X-ray nanocrystallography is a powerful imaging technique which is able to determine the atomic structure of a macromolecule from a large ensemble of nanocrystals. Determining structure from this ensemble is challenging because the images are noisy, and individual crystal sizes, orientations, and incident photon flux densities are unknown. Additionally, lattice symmetries may lead to orientation ambiguities. Here, we show how to determine crystal size, incident photon flux density, and crystal orientation from noisy data. We also demonstrate that these data can be used to perform reconstruction without extra experimental requirements, atomicity assumptions, or knowledge of similar structures.

Abstract

X-ray nanocrystallography allows the structure of a macromolecule to be determined from a large ensemble of nanocrystals. However, several parameters, including crystal sizes, orientations, and incident photon flux densities, are initially unknown and images are highly corrupted with noise. Autoindexing techniques, commonly used in conventional crystallography, can determine orientations using Bragg peak patterns, but only up to crystal lattice symmetry. This limitation results in an ambiguity in the orientations, known as the indexing ambiguity, when the diffraction pattern displays less symmetry than the lattice and leads to data that appear twinned if left unresolved. Furthermore, missing phase information must be recovered to determine the imaged object’s structure. We present an algorithmic framework to determine crystal size, incident photon flux density, and orientation in the presence of the indexing ambiguity. We show that phase information can be computed from nanocrystallographic diffraction using an iterative phasing algorithm, without extra experimental requirements, atomicity assumptions, or knowledge of similar structures required by current phasing methods. The feasibility of this approach is tested on simulated data with parameters and noise levels common in current experiments.

Although conventional X-ray crystallography has been used extensively to determine atomic structure, it is limited to objects that can be formed into large crystal samples Graphic. An appealing alternative, made possible by recent advances in light source technology, is X-ray nanocrystallography, which is able to image structures resistant to large crystallization, such as membrane proteins, by substituting a large ensemble of easier to build nanocrystals, typically Graphic, often delivered to the beam via a liquid jet (1⇓⇓⇓⇓–6) (Fig. 1). However, the beam power required to retrieve sufficient information destroys the crystal, hence ultrafast pulses (≤70 fs) are required to collect data before damage effects alter the signal. Using nanocrystals introduces several challenges. Due to the small crystal size, Bragg peaks are smeared out, and there is noticeable signal between peaks. Typically, only partial peak reflections are measured, resulting in reduced intensities. Variations in crystal size and incident photon flux density, unknown orientations, shot noise, and background signal from the liquid and detector add additional uncertainty to the data.

Fig. 1.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 1.

Liquid jet (blue) delivers nanocrystal samples to the X-ray beam (red). Wide- and small-angle diffraction data are collected using front and rear detectors.

If crystal orientations were known, noise and variation in the peak measurements could be averaged out, and the data could be inverted to retrieve the object’s electron density. Although autoindexing techniques can be used to determine crystal orientation up to lattice symmetry from the location of a sufficient number of Bragg peaks, they typically face difficulties in the presence of partial and non-Bragg reflections common in nanocrystal diffraction images. Furthermore, these techniques only narrow down orientation to a list of possibilities when the diffraction pattern has less symmetry than the lattice, leading to an ambiguity in the image orientation, known as the indexing ambiguity. Current methods of processing the diffraction data are largely based on averaging out the data variance over several images (1⇓⇓⇓⇓⇓⇓–8). However, if the data are processed without resolving the indexing ambiguity then they will appear to be perfectly twinned, i.e., averaged over multiple orientations. Although there has been some success in determining structure from perfectly twinned data, reconstruction is often infeasible without a good initial atomic model of the structure.

We present an algorithmic framework for X-ray nanocrystallographic reconstruction which is based on directly reducing data variance and resolving the indexing ambiguity. First, we design an autoindexing technique that uses both Bragg and non-Bragg data to compute precise orientations, up to lattice symmetry. Crystal sizes are then determined by performing a Fourier analysis around Bragg peak neighborhoods from a finely sampled low-angle image, such as from a rear detector (Fig. 1). Next, we model structure factor magnitudes for each reciprocal lattice point with a multimodal Gaussian distribution, using a multistage expectation maximization algorithm which simultaneously scales and models the data. These multimodal models are used to build a weighted graph which models the structure factor magnitude concurrency. We formulate the solution to the indexing ambiguity problem as finding the maximum edge weight clique in this graph, which can be solved efficiently via a greedy approach. Finally, we demonstrate the feasibility of solving the phase problem using iterative phase retrieval. Whereas several of the presented methods rely on the use of nanocrystals, we note that the scaling–multimodal analysis and indexing ambiguity resolution steps can also be applied to larger crystals Graphic.

Formulation

In X-ray crystallography, diffraction patterns are collected from a periodic crystal made up of the target object. The 3D crystal lattice structure may be described by its Bravais lattice characteristic Graphic, and its associated infinite lattice Graphic. We define the lattice rotational symmetry group Graphic to be the set of rotation operators which preserve the lattice structure Graphic.

Each lattice has both a Dirac comb Graphic, which is a sum of Dirac delta functions supported on the lattice points, and a dual Graphic, known as the reciprocal lattice, given by the support of the Dirac comb’s Fourier transform* Graphic. The Bravais vectors of the reciprocal lattice are given by† Graphic.

In practice, a crystal lattice Graphic consists of only a finite part of its associated infinite lattice. In this case, the associated Dirac comb’s Fourier transform Graphic, known as the shape transform Graphic, is no longer a sum of delta functions, but is instead a smeared-out version of Graphic. To simplify the discussion, we will be assuming that the finite crystal lattice can be described as a box with Nj unit cells in the direction of hj, i.e., Graphic. Here, the squared norm of its associated shape transform is given byEmbedded ImageWe note that the methods presented in the following sections can be extended to more general crystal shapes, but possibly at the cost of losing a simple closed-form expression for the shape transform.

The electron density Graphic of a crystal with periodic units on the lattice points of Graphic can be expressed in terms of the electron density ρ of one of its periodic unit cells by Graphic The space group of a crystal Embedded Image introduces another form of symmetry on its diffraction pattern, described by the Laue rotational symmetry group Graphic.

The image of diffraction intensities Graphic due to elastic scattering from a crystal with unit cell electron density Graphic and orientation Graphic, using a fully coherent X-ray beam with wavelength λ and incident photon flux density J, at a detector with pixel size dx at distance D from the interaction point and normal to the beam, is described byEmbedded Imagewhere Graphic is the electron cross-section, Graphic is a polarization factor, Graphic, Graphic is the solid angle subtended by a pixel, and Graphic. For elastic scattering, the values of Graphic are often called the structure factors.

For large crystals, the shape transform approaches the Dirac comb of the reciprocal lattice, up to a constant factor, and the diffraction images consist of a series of bright spots, known as Bragg peaks, concentrated at reciprocal lattice points. However, for nanocrystallography, small crystal sizes spread the shape transform, and measurements close to, but not directly at, a Bragg peak, known as partial reflections, have a decreased measured intensity. Additionally, the signal at pixels corresponding to lines in between adjacent reciprocal lattice points is often noticeable in nanocrystal diffraction images.

The goal of X-ray nanocrystallography is to determine the unit cell electron density ρ from a large ensemble of diffraction images, which vary in orientation, incident photon flux density, and crystal size and are corrupted with noise. Here we will focus on the case when Graphic, which leads to the indexing ambiguity.‡

Autoindexing

Commonly used autoindexing methods, e.g., refs. 9, 10, can accurately determine a lattice’s unit cell, given by the Bravais vectors in some reference configuration, using a large ensemble of images. However, the orientation information that these methods compute might not be accurate enough to be used in the evaluation of the shape transform, especially in the presence of non-Bragg spots and low Bragg peak counts. Starting from this unit cell information, we devise an algorithm which uses both Bragg and non-Bragg data to generate precise orientations, up to lattice symmetry.

Bravais Characteristic Vector Calculation.

In nanocrystallography images, the strongest signal will occur at reciprocal lattice points ξ, which satisfy Graphic if h is a Bravais vector. However, signal along lattice edges, i.e., lines connecting adjacent reciprocal lattice points, is also commonly noticeable. If ξ is a reciprocal lattice edge point then Graphic for two of the Bravais vectors and can be anything in Graphic for the remaining vector. The set B of both types of points appear as bright spots in the images, and can be located through thresholding. For a given image, a direction h is then likely to be a Bravais vector for the rotated lattice if Graphic is close to 1 for most Graphic. In particular, when searching for the jth Bravais vector Graphic, where Graphic, we attempt to filter out the non-Bragg spots which disagree with dj by looking for a direction d which solvesEmbedded Imagewhere Graphic is the set of Graphic points in B with the largest value of Graphic. If multiple Bravais vectors have the same length, we search for multiple solutions approximately separated by the known relative angles between these vectors. We generate search directions with an approximately uniform sampling of the half unit sphere (11). After locating the initial directions, we repeat the process on a finer set of sample directions in a restricted angular range around the previously computed directions. If a direction is not found, we use the known relative angles between the other two directions to narrow the search for, or directly deduce, the missing direction.

Lattice Orientation Calculation.

Once Bravais directions Graphic for the rotated lattice associated with an image are located, with the sign convention that yields the correct angles between vectors and Graphic, we use the known reference configuration Graphic to compute an approximation Graphic to the orientation matrix. If Graphic is far from unity then the autoindexing procedure failed to compute an accurate orientation and we thus reject Graphic. We then find the closest rotation matrix by computing the singular value decomposition Graphic and setting Graphic, which is used as the approximation to the image orientation up to lattice symmetry, i.e., Graphic where R is the full orientation and Graphic.

Crystal Size Determination

To compute accurate structure factor magnitudes from measured intensities, the squared magnitude of the shape transform must be divided out of the intensity measurements in Eq. 2. Near a Bragg peak, the shape transform grows quadratically with the crystal size, which often varies up to an order of magnitude in each dimension over the nanocrystal ensemble. We determine these crystal sizes by analyzing intensities around Bragg peaks in low-angle images (Fig. 2) sampled at least twice the Nyquist rate for the crystal, i.e., the pixel spacing is at most Graphic, where W is the width of the crystal. A Fourier analysis of these intensities reveals the crystal sizes.

Fig. 2.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 2.

Example of a low-angle image. The profile of the shape transform around the Bragg peaks (circled) can be used to determine the crystal sizes.

Fourier Analysis of the Shape Transform.

For an image I with orientation§ R, consider its restriction Ir to a small neighborhood Embedded Image centered at a low-angle Bragg peak with detector coordinates Graphic corresponding to the reciprocal lattice point Embedded Image, where Graphic is small. In Embedded Image, by taking a linear approximation of q and using the translation invariance of S on Embedded Image, the intensities are approximated, up to a constant C, byEmbedded Imagewhere Graphic and Graphic. If we denote Graphic, then Eq. 4 becomes the restriction of G to the rotated plane Graphic, whose Fourier transform,¶ from the Fourier projection slice theorem and the Wiener–Khinchin theorem, is approximately the X-ray projected autocorrelation of Graphic:‖Embedded ImageNote that the support of this projected autocorrelation is given by the Minkowski sum of the rotated projected crystal, i.e., supp Graphic.

If we approximate** the crystal lattice as Graphic, where Graphic are the crystal sizes for each Bravais direction, then the convex hull of the projected autocorrelation Graphic, where Conv(X) is the convex hull of the set X, can be expressed asEmbedded ImageBy computing Graphic, we can deduce the crystal sizes by analyzing H as long as none of the rotated Bravais vectors Graphic is orthogonal to the detector. In general, the boundary of H consists of a series of line segments, with three normals Graphic along with their three antiparallel directions, which can be found by computing the convex hull of the rotated projected autocorrelated unit cell, i.e., the right-hand side of Eq. 6 with Graphic for each j. In the direction ni, the extent of the convex hull Graphic must be equal to that of the unprojected crystal, i.e., Graphic. Therefore, by defining the matrix Graphic, the crystal sizes N can be retrieved by solving Graphic, where Graphic. If the image does not directly pass through a reciprocal lattice point, this analysis is still valid; however, the Fourier-transformed images may contain oscillations, which grow with distance to the lattice point.

Image Segmentation.

To retrieve the crystal sizes via the methods of the previous section, we require an estimate for the support of the projected autocorrelated crystal from the Fourier transformed images, i.e., we need to segment the support from the noisy background (Fig. 3). We begin the segmentation by initializing a set H of pixels whose Graphic value is greater than a fixed percentage of the largest†† value. Then, we traverse a sorted list of the remaining values, adding pixels to H until one reaches a point more than some threshold, typically a few pixels, away from all of the pixels currently in H, suggesting that one has reached the end of the support and has begun to see the oscillations from the background.‡‡

Fig. 3.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 3.

The shape transform (Left) around a low-angle peak is Fourier transformed to reveal the projected autocorrelated crystal (middle), which is then segmented (Right).

Structure Factor Magnitude Modeling

Once the lattice orientations Graphic and crystal sizes Nm are known, we can use Eq. 2 to compute an approximation Graphic to the structure factor square magnitudes from the image Im, but only up to a constant factor because they are scaled by the unknown incident photon flux density Jm, which varies between images. Furthermore, due to the indexing ambiguity, one only knows the corresponding reciprocal space coordinates associated with the values of Graphic up to the crystal lattice symmetry, i.e., the possible structure factor magnitudes for each point take the form of a multimodal distribution. Moreover, these two problems are strongly coupled together: We cannot perform the scaling correction unless we know what modes to scale to and the modes are indistinguishable in the unscaled data set. Hence, we must simultaneously determine both the scaling and the multimodal parameters.

Processing the Data.

We approximate the structure factor squared magnitudes at the ith reciprocal lattice point§§ ξi from the mth image by computing an average over the neighboring ball Graphic with radius r:Embedded ImageWe discard any intensities below some fixed threshold from the above sum to prevent large errors from the division. Note that because we only know the orientation up to lattice symmetry, we also set¶¶ Graphic for every Graphic such that for some Graphic, Graphic. To simplify notation, we will assume that unmeasured values and corresponding indices are removed from the remaining sets and summations. To reduce the dependence of the standard deviation on the size of the intensities, we use two applications of variance stabilization (12), and instead work with Graphic.

Multimodal Analysis.

Assume for now that the structure factor magnitudes are already properly scaled. Due to the indexing ambiguity, at this point in the procedure we only know the orientations up to the lattice symmetry. Thus, for each reciprocal lattice point Graphic, values of Graphic could correspond to K different structure factor magnitudes, e.g., for elastic scattering Graphic. A histogram of Graphic for Graphic reveals K different peaks, smeared out as noise and parameter uncertainty are increased. Our goal is to detect these peaks and model the associated multimodal distribution.

To retrieve the set of possible structure factor magnitudes, we will model the computed values Graphic from each reciprocal lattice point Graphic with a multimodal Gaussian distribution. Specifically, the associated probability density functions can be expressed in terms of multiple Gaussian distributions with means Graphic, in monotonically increasing order, and standard deviations Graphic by Graphic, where Graphic.

Given Graphic, we determine its multimodal model through an expectation maximization algorithm. In particular, given an initial guess for model parameters Graphic and Graphic, we perform several iterations of the following:Embedded ImageHere, Graphic represents the probability that Graphic is drawn from the jth Gaussian mode. To initialize, we separate the data into K equal bins, define Graphic as the location in the jth bin with the greatest density, and set Graphic to be less than a typical bin size. Also, we perform outlier rejection by removing any Graphic in which Graphic is below a given threshold.

Scaling Correction.

In practice, variance in incident photon flux density, noise, and errors in autoindexing and crystal size determination smear out the peaks in the histogram, making them difficult to locate via expectation maximization (Fig. 4): Data must be scaled to properly model the structure factor magnitudes. To do so, we seek scaling factors which minimize the variance in the histograms and alternate this procedure with the expectation maximization step in Eq. 8.

Fig. 4.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 4.

(Left) Histogram of the possible unscaled variance stabilized structure factor magnitudes for a reciprocal lattice point corresponding to a fourfold indexing ambiguity. (Right) Histogram of the scaled data with multimodal Gaussian model (red).

We seek the scaling factor Graphic for the mth image by solvingEmbedded Imagewhose solution is given byEmbedded ImageOnce the Graphic are computed, they are normalized so that Graphic and then used to scale the images by replacing every Graphic with Graphic. Scaling is alternated with expectation maximization until convergence.

Resolving the Indexing Ambiguity

After the structure factor magnitude modeling, we know up to K possible structure factor magnitudes at each reciprocal lattice point. Resolving the indexing ambiguity amounts to correctly assigning one of these K values to each point. There are K equally valid solutions, related to each other by globally applying a rotation from Graphic. We first use the set of multimodal model parameters to construct a graph theoretic model of the structure factor magnitude concurrency, i.e., the probability that two given structure factor magnitudes occur within the same image. Then, we resolve the indexing ambiguity by finding the maximum edge weight clique of this graph with a greedy approach.

Graph Theoretic Modeling of Structure Factor Magnitude Concurrency.

Given the scaled variance stabilized structure factor magnitudes Graphic, means Graphic, and standard deviations Graphic for the mth image and jth mode at the ith reciprocal lattice point, we construct‖‖ a graph Graphic with vertices Graphic and edges E, where Graphic if and only if Graphic and Graphic, i.e., only one j can be selected at each reciprocal lattice point Graphic and each j can only be selected once among its twin-related coordinates Graphic, where Graphic. Consequently, choosing a consistent set of structure factor magnitudes, where each possible value appears exactly once, is equivalent to finding a maximal clique*** in this graph.†††

We define a directed weight Graphic on G as the ratio of the sum of concurrence probabilities over the sum of the occurrence probabilities, where we sum over the sets Graphic, consisting of all of the images which potentially measure intensities simultaneously at Graphic and Graphic:Embedded ImageW gives a measure of how likely the values of Graphic and Graphic are to occur together in one of the solutions. With no noise or error, if Graphic are distinct then W is exactly 1 when Graphic and Graphic are simultaneously part of one of the solutions and is 0 otherwise. However, if there are B values of k such that Graphic then W is Graphic. With noise present, the asymmetry of the weight function favors structure factor magnitudes with strong signal and whose histograms are highly multimodal, providing the most orientation information.

We now formulate the solution to the indexing ambiguity problem. We seek the maximal clique in G with maximum edge weight:Embedded Imagewhere Graphic is the set of all maximal cliques in G. If a sufficient number of images are used, then, in the absence of noise and error, the maximizer of Eq. 12 retrieves one of the exact solutions to the indexing ambiguity problem, i.e., the maximum edge weight clique Graphic assigns the correct structure factor magnitudes, up to a global rotation. For imperfect data, this maximizer will choose a solution which is most consistent with the observed structure factor magnitude concurrency.

Greedy Approach to the Maximum Edge Weight Clique Problem.

Even though the maximum edge weight clique problem is, in general, nondeterministic polynomial-time hard (13), when constructed from the indexing ambiguity problem via Eq. 12, we can solve it in quadratic time with a greedy approach. We initialize the clique Graphic with some starting vertex‡‡‡ Graphic and progressively add vertices that maximize the weight sum of the current clique. In practice, we remove any points whose associated multimodal distributions contain less than the maximum number of modes. For convenience, we use a single index for the vertices Graphic and we set Graphic if Graphic.§§§

Algorithm 1.

Embedded Image

The elements of the set Graphic returned by Algorithm 1 are pairs of the form Graphic, corresponding to choosing the jth modeled variance stabilized structure factor magnitude Graphic at the reciprocal lattice point Graphic. This induces the map Graphic where Graphic. With a sufficient number of images and no noise or error, Algorithm 1 retrieves an exact solution to the indexing ambiguity problem, always preferring a vertex Graphic with nonzero weighted edges connecting to all elements of the current clique, i.e., Graphic is only chosen if it corresponds to the same solution as the rest of clique. This approach remains robust with imperfect data, as it considers several pairs of measured intensities over all images to choose the structure factor magnitudes at any single point.

Orientation Calculation.

Although we can achieve a robust approximation of the complete structure factor magnitudes, accuracy can be improved by using this information to directly orient each image, and then averaging structure factor magnitudes for each corresponding reciprocal lattice point. For every image Graphic, we compute its full orientation Graphic, where Graphic solvesEmbedded ImageHere, A is the set of indices i where Graphic is measured by the image at the orientation Graphic and Graphic is a set of class representatives of the quotient group Graphic, i.e., Graphic consists of the identity and twinning operators. If there are at least two orientations close to the minimum value, then we reject the computed orientation. Once the orientations are known, we obtain the structure factor magnitudes by averaging the corresponding magnitudes computed from each scaled oriented image.

Phase Recovery

Once complete structure factor magnitudes are computed, one can determine missing phases and, thus, determine the electron density of a periodic unit with any applicable phasing method, e.g., refs. 14⇓⇓–17. Although these methods have been used extensively to determine structure in X-ray crystallography, each has limitations or introduces extra difficulties into the experimental setup. An appealing alternative is to directly deduce phases from Fourier magnitude data using an iterative phase retrieval technique, which only requires that Fourier magnitudes be sampled at a sufficient rate. Although infeasible in conventional crystallography, the signal from nanocrystals contains significant information between Bragg peaks, which may allow sampling at the required rates.

In general, such iterative phasing is possible if one samples the Fourier magnitudes at points of the form Graphic where Graphic (18, 19). In ref. 20 the feasibility of such an approach was demonstrated assuming that adequate signal was collected at each of the required points. However, the square magnitude of the shape transform at Graphic grows quadratically in the crystal size for each dimension in which Graphic. For nanocrystallography, this typically only results in noticeable signal at reciprocal lattice vertices and edges.¶¶¶,‖‖‖ While theory is lacking, this sampling density is sufficient in certain 3D cases (21). Alternatively, a recent approach uses Fourier magnitude information along with its gradient, assuming it can be accurately calculated, only at reciprocal lattice points (22). Here, we test the feasibility of iterative phasing using the Fourier magnitudes computed from our framework at reciprocal lattice vertices and edges.

Iterative Phase Retrieval.

Given a domain**** Graphic, Fourier magnitude values Graphic, and some support Graphic, iterative phase retrieval algorithms seek a function Graphic such that Graphic and Graphic for all Graphic. Given a set Ω of points where a has a recorded value, such algorithms typically make use of the projection operators Graphic, where Graphic if Graphic and Graphic otherwise, and Graphic, whereEmbedded Image

We alternate between several iterations of the error reducing algorithm (23): Graphic, and the hybrid input-output algorithm (24): Graphic, Graphic, to seek out the solution ρ. Furthermore, we couple these iterations with the Shrinkwrap method (25), which, starting with an initial guess such as the unit cell, updates an estimate of the true support T by convolving the current iterate with a Gaussian and then thresholding.

Results

We demonstrate our methodology by determining the structure of PuuE Allantoinase from simulated diffraction data using the atomic coordinates and crystal symmetry recorded in refs. 26, 27 with several different peak incident photon flux densities. Each data set consists of 33,856 diffraction images. We assume knowledge of the Bravais vector lengths and the space group, which, in practice, may be deduced from autoindexing information and reflection conditions.

The crystal displays P4 space-group symmetry, thus diffraction data are symmetric with respect to 90° rotation about the z axis and inversion, and has a twinning operator given by 180° rotation about the x axis, or, equivalently, the y axis.

Image orientations are generated by randomly sampling from a normal distribution of quaternions. Sizes were randomly generated with an average crystal width of μC = 2,948.97 Å and standard deviation σC = 982.99 Å. For each image, we generate random incident photon flux densities J, measured in photons per square Angstrom per pulse, from a peak density Graphic, via Graphic. We use experimental parameters†††† similar to ref. 3. Intensity values are computed via Eq. 2 along with shot and background noise, modeled with a Poisson distribution and a normal distribution, with a standard deviation of 1.3 photons per pixel (Figs. S1–S5).

Here we present statistics‡‡‡‡ for our framework (Tables 1–3) along with a reconstruction from the processed simulated data (Fig. 5). For further details, see ref. 28.

View this table:
  • View inline
  • View popup
Table 1.

Autoindexing performance

View this table:
  • View inline
  • View popup
Table 2.

Crystal size determination performance

View this table:
  • View inline
  • View popup
Table 3.

Orientation determination performance

Fig. 5.
  • Download figure
  • Open in new tab
  • Download powerpoint
Fig. 5.

Electron density contours of the exact solution (Left) and the computed solution with Graphic (Center) and overlay with the atomic model (Right).

Acknowledgments

We thank Stefano Marchesini for many valuable conversations. This research was supported in part by the Applied Mathematical Sciences subprogram of the Office of Energy Research, US Department of Energy (DOE) under Contract DE-AC02-05CH11231 and by the Division of Mathematical Sciences of the National Science Foundation, and used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the US DOE under Contract DE-AC02-05CH11231. J.A.S. was also supported by an Einstein Visiting Fellowship of the Einstein Foundation, Berlin. J.J.D. was also supported by a DOE Computational Science Graduate Fellowship.

Footnotes

  • ↵1To whom correspondence should be addressed. E-mail: sethian{at}math.berkeley.edu.
  • Author contributions: J.J.D. and J.A.S. designed research, performed research, analyzed data, and wrote the paper.

  • The authors declare no conflict of interest.

  • ↵*We will use Graphic to denote either the continuous or discrete Fourier transform of f, depending on context.

  • ↵†Graphic refers to the transpose of the inverse of A.

  • ↵‡Graphic designates the number of elements in A.

  • ↵§Because the shape transform is symmetric with respect to rotation by elements of Graphic, the use of orientations from autoindexing is sufficient here.

  • ↵¶Graphic may be slightly smeared out due to grid alignment effects.

  • ↵‖Graphic is the X-ray projection operator through the detector plane normal and A is the autocorrelation operator.

  • ↵**We note that alternative models of the finite lattice may be used here instead.

  • ↵††We exclude the origin which picks up all of the noise within the image.

  • ↵‡‡H gives unitless coordinates which should be scaled by Graphic for a restricted image with Graphic pixels of size Graphic. Crystal sizes are rejected if they are outside of a set range.

  • ↵§§To keep our notation compact, we are representing the reciprocal lattice points with a single index in place of the traditional Miller indices.

  • ↵¶¶Graphic is multivalued for an image which measures intensities at both Graphic and Graphic, Graphic.

  • ↵‖‖For known symmetry in the structure factor magnitudes (e.g., Friedel or Laue symmetry), we simplify G’s structure by merging corresponding symmetric vertices.

  • ↵***Graphic is a maximal clique if for all Graphic with no proper superset Graphic satisfying the same property.

  • ↵†††Graphic is the set difference of A and B.

  • ↵‡‡‡Graphic is typically chosen from a point with a highly multimodal distribution and strong signal.

  • ↵§§§In Algorithm 1, Embedded Image denotes the assignment operator.

  • ↵¶¶¶The lattice edge structure factor magnitudes may be computed via Eq. 7.

  • ↵‖‖‖One may also need to resolve the indexing ambiguity on the used non-Bragg data if it has less symmetry than the Laue group.

  • ↵****This is obtained by linearly mapping the reciprocal lattice onto a uniform grid.

  • ↵††††Graphic Å, Graphic, D = 68/141 mm for the front/rear detectors with 1,024 × 1,024 pixels, horizontal polarization, and Graphic between 0.01 and 100 times current experimental levels.

  • ↵‡‡‡‡Here we use the distance in Frobenius norm modulo Graphic.

  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1321790111/-/DCSupplemental.

References

  1. ↵
    1. Aquila A,
    2. et al.
    (2012) Time-resolved protein nanocrystallography using an X-ray free-electron laser. Opt Express 20(3):2706–2716.
    OpenUrlCrossRefPubMed
  2. ↵
    1. Boutet S,
    2. et al.
    (2012) High-resolution protein structure determination by serial femtosecond crystallography. Science 337(6092):362–364.
    OpenUrlAbstract/FREE Full Text
  3. ↵
    1. Chapman HN,
    2. et al.
    (2011) Femtosecond X-ray protein nanocrystallography. Nature 470(7332):73–77.
    OpenUrlCrossRefPubMed
  4. ↵
    1. Johansson LC,
    2. et al.
    (2012) Lipidic phase membrane protein serial femtosecond crystallography. Nat Methods 9(3):263–265.
    OpenUrlCrossRefPubMed
  5. ↵
    1. Kern J,
    2. et al.
    (2013) Simultaneous femtosecond X-ray spectroscopy and diffraction of photosystem II at room temperature. Science 340(6131):491–495.
    OpenUrlAbstract/FREE Full Text
  6. ↵
    1. Koopmann R,
    2. et al.
    (2012) In vivo protein crystallization opens new routes in structural biology. Nat Methods 9(3):259–262.
    OpenUrlCrossRefPubMed
  7. ↵
    1. Spence JCH,
    2. Weierstall U,
    3. Chapman HN
    (2012) X-ray lasers for structural and dynamic biology. Rep Prog Phys 75(10):102601.
    OpenUrlCrossRefPubMed
  8. ↵
    1. White TA,
    2. et al.
    (2012) Crystfel: A software suite for snapshot serial crystallography. J Appl Cryst 45(2):335–341.
    OpenUrlCrossRef
  9. ↵
    1. Duisenberg AJM
    (1992) Indexing in single-crystal diffractometry with an obstinate list of reflections. J Appl Cryst 25:92–96.
    OpenUrlCrossRef
  10. ↵
    1. Steller I,
    2. Bolotovsky R,
    3. Rossmann MG
    (1997) An algorithm for automatic indexing of oscillation images using Fourier analysis. J Appl Cryst 30:1036–1040.
    OpenUrlCrossRef
  11. ↵
    1. Lovisolo L,
    2. da Silva EAB
    (2001) Uniform distribution of points on a hyper-sphere with applications to vector bit-plane encoding. IEEE Proc Vision Image Signal Process 148(3):187–193.
    OpenUrlCrossRef
  12. ↵
    1. Guan Y
    (2009) Variance stabilizing transformations of Poisson, binomial and negative binomial distributions. Stat Probab Lett 14:1621–1629.
    OpenUrl
  13. ↵
    1. Alidaee B,
    2. Glover F,
    3. Kochenberger G,
    4. Wang H
    (2007) Solving the maximum edge weight clique problem via unconstrained quadratic programming. Eur J Oper Res 181(2):592–597.
    OpenUrlCrossRef
  14. ↵
    1. Green DW,
    2. Ingram VM,
    3. Perutz MF
    (1954) The structure of haemoglobin. IV. Sign determination by the isomorphous replacement method. Proc. Roy. Soc. A. 225(1162):287–307.
    OpenUrlAbstract/FREE Full Text
  15. ↵
    1. Hauptman H,
    2. Karle J
    (1953) Solution of the Phase Problem I. The Centrosymmetric Crystal, No. 3 (American Crystallographic Association, New York).
  16. ↵
    1. Karle J
    (1980) Some developments in anomalous dispersion for the structural investigation of macromolecular systems in biology. Int J Quantum Chem Quantum Biol Symp 18(S7):357–367.
    OpenUrlCrossRef
  17. ↵
    1. Rossmann MG,
    2. Blow DM
    (1962) The detection of sub-units within the crystallographic asymmetric unit. Acta Crystallogr 15(1):24–31.
    OpenUrlCrossRef
  18. ↵
    1. Hayes MH
    (1982) The reconstruction of a multidimensional sequence from the phase or magnitude of its Fourier transform. IEEE Trans Acoust Speech Signal Process 30(2):140–154.
    OpenUrlCrossRef
  19. ↵
    1. Rosenblatt J
    (1984) Phase retrieval. Commun Math Phys 95:317–343.
    OpenUrlCrossRef
  20. ↵
    1. Spence JCH,
    2. et al.
    (2011) Phasing of coherent femtosecond X-ray diffraction from size-varying nanocrystals. Opt Express 19(4):2866–2873.
    OpenUrlCrossRefPubMed
  21. ↵
    1. Millane RP
    (1996) Multidimensional phase problems. J Opt Soc Am A Opt Image Sci Vis 13(4):725–734.
    OpenUrlCrossRef
  22. ↵
    1. Elser V
    (2013) Direct phasing of nanocrystal diffraction. Acta Crystallogr A 69:559–569.
    OpenUrlCrossRef
  23. ↵
    1. Gerchberg RW,
    2. Saxton WO
    (1972) A practical algorithm for the determination of the phase from image and diffraction plane pictures. Optik (Stuttg) 35:237–246.
    OpenUrl
  24. ↵
    1. Fienup JR
    (1978) Reconstruction of an object from the modulus of its Fourier transform. Opt Lett 3(1):27–29.
    OpenUrlCrossRefPubMed
  25. ↵
    1. Marchesini S,
    2. et al.
    (2003) X-ray image reconstruction from a diffraction pattern alone. Phys Rev B 68(4):140101.
    OpenUrlCrossRef
  26. ↵
    1. Bernstein FC,
    2. et al.
    (1977) The Protein Data Bank: A computer-based archival file for macromolecular structures. J Mol Biol 112(3):535–542.
    OpenUrlCrossRefPubMed
  27. ↵
    1. Ramazzina I,
    2. et al.
    (2008) Logical identification of an allantoinase analog (puuE) recruited from polysaccharide deacetylases. J Biol Chem 283(34):23295–23304.
    OpenUrlAbstract/FREE Full Text
  28. ↵
    1. Donatelli JJ
    (2013) Reconstruction algorithms for x-ray nanocrystallography via solution of the twinning problem, PhD Thesis (University of California, Berkeley).
PreviousNext
Back to top
Article Alerts
Email Article

Thank you for your interest in spreading the word on PNAS.

NOTE: We only request your email address so that the person you are recommending the page to knows that you wanted them to see it, and that it is not junk mail. We do not capture any email address.

Enter multiple addresses on separate lines or separate them with commas.
Algorithmic framework for X-ray nanocrystallographic reconstruction in the presence of the indexing ambiguity
(Your Name) has sent you a message from PNAS
(Your Name) thought you would like to see the PNAS web site.
CAPTCHA
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.
Citation Tools
X-ray nanocrystallographic reconstruction
Jeffrey J. Donatelli, James A. Sethian
Proceedings of the National Academy of Sciences Jan 2014, 111 (2) 593-598; DOI: 10.1073/pnas.1321790111

Citation Manager Formats

  • BibTeX
  • Bookends
  • EasyBib
  • EndNote (tagged)
  • EndNote 8 (xml)
  • Medlars
  • Mendeley
  • Papers
  • RefWorks Tagged
  • Ref Manager
  • RIS
  • Zotero
Request Permissions
Share
X-ray nanocrystallographic reconstruction
Jeffrey J. Donatelli, James A. Sethian
Proceedings of the National Academy of Sciences Jan 2014, 111 (2) 593-598; DOI: 10.1073/pnas.1321790111
Digg logo Reddit logo Twitter logo Facebook logo Google logo Mendeley logo
  • Tweet Widget
  • Facebook Like
  • Mendeley logo Mendeley
Proceedings of the National Academy of Sciences: 111 (2)
Table of Contents

Submit

Sign up for Article Alerts

Article Classifications

  • Physical Sciences
  • Applied Mathematics

Jump to section

  • Article
    • Abstract
    • Formulation
    • Autoindexing
    • Crystal Size Determination
    • Structure Factor Magnitude Modeling
    • Resolving the Indexing Ambiguity
    • Phase Recovery
    • Results
    • Acknowledgments
    • Footnotes
    • References
  • Figures & SI
  • Info & Metrics
  • PDF

You May Also be Interested in

Surgeons hands during surgery
Inner Workings: Advances in infectious disease treatment promise to expand the pool of donor organs
Despite myriad challenges, clinicians see room for progress.
Image credit: Shutterstock/David Tadevosian.
Setting sun over a sun-baked dirt landscape
Core Concept: Popular integrated assessment climate policy models have key caveats
Better explicating the strengths and shortcomings of these models will help refine projections and improve transparency in the years ahead.
Image credit: Witsawat.S.
Double helix
Journal Club: Noncoding DNA shown to underlie function, cause limb malformations
Using CRISPR, researchers showed that a region some used to label “junk DNA” has a major role in a rare genetic disorder.
Image credit: Nathan Devery.
Steamboat Geyser eruption.
Eruption of Steamboat Geyser
Mara Reed and Michael Manga explore why Yellowstone's Steamboat Geyser resumed erupting in 2018.
Listen
Past PodcastsSubscribe
Birds nestling on tree branches
Parent–offspring conflict in songbird fledging
Some songbird parents might improve their own fitness by manipulating their offspring into leaving the nest early, at the cost of fledgling survival, a study finds.
Image credit: Gil Eckrich (photographer).

Similar Articles

Site Logo
Powered by HighWire
  • Submit Manuscript
  • Twitter
  • Facebook
  • RSS Feeds
  • Email Alerts

Articles

  • Current Issue
  • Special Feature Articles – Most Recent
  • List of Issues

PNAS Portals

  • Anthropology
  • Chemistry
  • Classics
  • Front Matter
  • Physics
  • Sustainability Science
  • Teaching Resources

Information

  • Authors
  • Editorial Board
  • Reviewers
  • Librarians
  • Press
  • Site Map
  • PNAS Updates

Feedback    Privacy/Legal

Copyright © 2021 National Academy of Sciences. Online ISSN 1091-6490