
Towards machine learning for heterogeneous inverse scattering in 3D microscopy


Abstract

Light propagating through a nonuniform medium scatters as it interacts with particles of different refractive properties, such as cells in tissue. In this work we aim to utilize this scattering process to learn a volumetric reconstruction of scattering parameters, in particular particle densities. We target microscopy applications where coherent speckle effects are an integral part of the imaging process. We argue that the key to successful learning is modeling realistic speckles in the training process. To this end, we build on the development of recent physically accurate speckle simulators. We also explore how to incorporate speckle statistics, such as the memory effect, in the learning framework. Overall, this paper contributes an analysis of multiple aspects of the network design, including the learning architecture, the training data and the desired input features. We hope this study will pave the way for the future design of learning based imaging systems in this challenging domain.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Scattering refers to the propagation of light in non-uniform media composed of small discrete scatterers, usually particles of varying refractive properties, such as water droplets dispersed in the sky or cells in tissue. As incident light propagates through the medium, it interacts with scatterers multiple times, and each such interaction reshapes its wavefront. Scattering is commonly encountered when visible light interacts with a large variety of materials, for instance biological tissues, minerals, the atmosphere and clouds, cosmetics, and many industrial chemicals.

In material acquisition one attempts to image the scattering profiles of a material under multiple illumination and viewing conditions and use them to estimate the type, size and density of the composing particles. Knowledge of scattering parameters is valuable in many application settings. For example, in tissue imaging it can be used to detect tumors and classify them as malignant or non-malignant [1]; in blood analysis, to recover diagnostically important parameters such as red and white blood cell counts [2,3]; in material science and fabrication applications, to validate the fidelity of manufactured material samples [4]; and in flow cytometry and particle sizing applications, to infer the chemical composition of industrial nanodispersions [5].

In this work we target volumetric acquisition of heterogeneous materials, whose particle density is spatially varying. We aim to recover these densities at a microscopic scale, targeting mostly tissue imaging applications. To achieve a high resolution volumetric scan of the material, we use a confocal imaging scan of the volume as the input to our estimation process. In a confocal scan, one successively illuminates the volume with a light cone focused at one spot, and images with a sensor focused at the same spot. The focal spot is sequentially scanned throughout the volume. Confocal imaging is one of the highest resolution imaging modalities used in microscopy. Yet, when trying to image deep into the material it is highly aberrated by scattering. In this work we aim to design a learning framework which uses this scattered light to infer material properties.

An important characteristic of a confocal microscope scan is that at this high resolution, incoherent illumination models fail to describe the scattering process and coherent propagation leads to speckle variations. Speckles are the result of constructive and destructive interference between light paths, which manifest as high resolution noise-like variations in the captured data. These noise-like patterns make the captured data challenging to interpret.

Previous approaches to material acquisition mostly relied on incoherent illumination models, which fail to account for speckle variations. These include inverse rendering techniques which attempt to explicitly model the scattering process and invert it either analytically [1,6–16], or using Monte-Carlo rendering [17–25]. Another line of research is based on learning from training data [26–28].

While machine learning techniques have revolutionized the world, their application to microscopy has been more limited. This is in part due to the fact that the success of a learning algorithm heavily depends on the quality of the provided training data. In some application scenarios, such training data can be easily collected or rendered using computer graphics simulation techniques. In microscopy, the collection of training data is much less straightforward. Generating phantoms with ground truth material properties is not simple and measuring such phantoms in the lab can be a tedious manual process. At the same time, classical computer graphics rendering strategies are based on incoherent ray tracing models and cannot simulate realistic speckle variations.

In this work we attempt to train a learning network for microscopic acquisition of scattering parameters, using training data that simulates coherent scattering and speckle variation. Fig. 1 illustrates our pipeline.

Fig. 1. A visualization of our estimation pipeline.

Classical optics approaches to the simulation of coherent propagation rely on explicit solutions to the wave equation [29–31], which are computationally prohibitive and impractical for real size volumes. Instead, in this work we adopt a recent approach to speckle simulation [32,33] based on computer graphics ray tracing algorithms. This approach is orders of magnitude more efficient than classical optics solvers and scales to much larger scenes. Furthermore, it has been shown to be physically accurate and to produce exactly the same speckle statistics. We use this simulator to synthesize training data describing coherent propagation in tissue, exploiting previously reported tissue characteristics [10,11,34]. With this data we train a network for volumetric reconstruction of the density of scattering particles. We analyze multiple aspects of the network design, offering observations on the following fronts:

  • 1. We explore multiple learning strategies and propose a V-net architecture which is successful in reconstructing piecewise smooth densities with high frequency transitions.
  • 2. We explore the difference between coherent and incoherent simulation of the scattering process, and show that correctly modeling coherent speckle effects is crucial for successful training.
  • 3. We explore the usage of speckle statistics such as the memory effect, and test what measurement strategies provide better input features to the network.

Taken together, this paper offers a detailed exploration of various design choices in the training of volumetric reconstruction networks. We hope our findings will pave the way for the future development of real microscopy acquisition systems, and the incorporation of machine learning strategies in this challenging domain.

2. Related work

In this work we seek a volumetric reconstruction of bulk material parameters, such as the density of scattering particles. Most previous approaches used tomographic reconstruction as well as inverse rendering techniques relying on incoherent illumination models [1,6–23].

In diffuse correlation spectroscopy [2,35–37] one attempts to utilize temporal speckle variations to reconstruct volumetric material parameters. However, it relies on diffusion models and is valid only for thick, heavily scattering volumes where resolution is already quite limited. In contrast, our work targets low to moderate thicknesses, where more information can be recovered but exact modeling is harder. Also, we utilize spatial speckle correlations rather than temporal ones.

For optically thin volumes, diffraction tomography seeks a volumetric reconstruction of the refractive index variations in the volume from measurements under multiple illumination and viewing directions. This was traditionally approached by various non-linear optimization strategies attempting to invert the wave equation [38–41], and recently also with deep learning [42,43]. Despite significant research progress, this problem is notably harder than the one addressed in this paper, and for many applications one is more interested in a statistical bulk material description rather than exact wavelength scale variations.

There is also research on acquisition or recognition of homogeneous materials using either incoherent or coherent scattered light [19,26,28,44]. In contrast, in this research we seek high-resolution near-field volumetrically varying material reconstruction.

Another line of research attempts to see through or inside scattering media. Rather than recovering the scattering medium, it attempts to remove the artifacts the medium induces on a latent illumination source. Classical approaches rely on detecting auto-correlations induced by the memory effect [45–47]. Recently, machine learning techniques have been successfully applied [48–53] to enhance results.

3. Data formation

We start with a problem statement and a description of data synthesis. In Sec. 4 we proceed to describe the learning network, and in Sec. 5 discuss our observations.

3.1 Problem formulation

We consider a volumetric scattering material $\cal V$ described by spatially varying bulk parameters $\sigma _t(\textbf {o}),\sigma _a(\textbf {o}),\sigma _s(\textbf {o}),\rho (\textbf {o},\hat{\textbf {i}}, \hat{\textbf {v}})$, where $\sigma _t(\textbf {o})$ is the extinction coefficient at position $\textbf {o}$. $\sigma _t(\textbf {o})$ is proportional to the local density of scattering particles at position $\textbf {o}$, and throughout this paper we will be referring to it as the description of volumetric density. $\sigma _a(\textbf {o})$ is the absorption coefficient, $\sigma _s(\textbf {o})=\sigma _t(\textbf {o})-\sigma _a(\textbf {o})$ the scattering coefficient, and $\rho (\textbf {o}, \hat{\textbf {i}}, \hat{\textbf {v}})$ is the local phase function at position $\textbf {o}$, describing the amount of light scattered, upon interaction with a scattering particle, from an incoming direction $\hat{\textbf {i}}$ toward an outgoing direction $\hat{\textbf {v}}$. Throughout this paper we mostly assume that $\sigma _a$ and $\rho$ are spatially constant and given. We attempt to estimate volumetric variations in the density $\sigma _t(\textbf {o})$. The mean free path ($MFP$) of the material equals the average distance that light travels between two scattering events. For a homogeneous material, one can show that $MFP=1/\sigma _t$. The optical depth ($OD$) of the material is the average number of scattering events on a light path through the scattering volume.
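As a concrete numeric illustration (ours, following directly from these definitions): for a homogeneous slab of thickness $L$, the optical depth is
$$OD = \sigma_t \, L = L/MFP,$$
so, e.g., a slab of thickness $L=40\lambda$ with $\sigma_t=0.1\lambda^{-1}$ has $MFP=10\lambda$ and $OD=4$, i.e., light scatters four times on average while crossing it.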

Our goal is to image the material under multiple illumination and viewing conditions and recover its volumetric density $\sigma _t(\textbf {o})$.

To this end we aim to generate training data of volumes with ground truth bulk parameters, and render the scattered images they produce. In Fig. 2 we illustrate a typical 3D volume in our dataset with spatially varying particle densities. We will be visualizing the volume using a sequence of 2D slices as illustrated in Fig. 2(a). As we aim to estimate a 3D volumetric density of the volume $\sigma _t(\textbf {o})$, we consider confocal microscopy images, which can provide the highest resolution scanning of a 3D volume. In confocal microscopy one focuses light into a single point in the volume and images with a lens focused at the same spot. By scanning the focal point in 3D and capturing multiple shots, a volumetric image is achieved.

Fig. 2. A typical 3D volume in our dataset. (a) 10 different z-slices through the volume. We also show (b) incoherent and (c) coherent renderings of the 2D confocal images in these planes. Due to defocus and multiple scattering we observe non zero intensities even in depth planes where no scattering particles are present and $\sigma _t=0$. Note that the color values in (a) encode variation in density $\sigma _t$, while in (b,c) color encodes the intensity of the captured images.

3.2 Data generation

Incoherent rendering: The first approach we take here follows classical computer graphics rendering strategies [54], to produce smooth incoherent intensity images. These algorithms use Monte-Carlo strategies to trace paths of scattered light in the volume, while treating the light as a set of incoherent rays with no phase.

Coherent rendering: Incoherent simulation models lead to smooth, speckle-free images. In practice, microscopy theory [55,56] states that to image speckle-free data, the numerical aperture of the illumination, namely, the range of angles at which illumination arrives at the imaged spot, should be wider than the numerical aperture of the imaging objective. In this work we target high resolution confocal volumetric scanning, and we simulate our data at $NA=0.5$, which approaches the limit of what objectives can achieve without significant aberrations. As we cannot increase the NA of the illumination much further, to eliminate speckles we can only reduce the NA of the imaging objective, which will unavoidably lead to reduced resolution. We further elaborate on the trade-off between resolution and speckle reduction in Supplement 1. To correctly model the high resolution data imaged by a confocal microscope, our simulator must account for coherent speckle effects. Moreover, in this work we argue that coherent speckle effects are an additional source of information which can lead to superior results compared even with idealized high resolution incoherent models.

Figure 2 visualizes the difference between coherent and incoherent simulations. Simulating realistic speckle effects is a longstanding challenge in the optics literature, where the classical approach relies on solving the wave equation explicitly. While exact wave solvers exist [29–31], they are computationally prohibitive and usually do not scale to any real-sized problem. Recently, [32,33] have introduced a physically correct speckle simulator adopting Monte-Carlo ray tracing strategies. This simulator is orders of magnitude more efficient than naive wave solvers and scales to much larger scenes.

We note that from the same bulk density we can sample multiple speckle realizations, each representing a different instantiation of scattering particles sampled from the same volumetric density $\sigma _t(\textbf {o})$.

Data geometry: We generated three training databases with three levels of complexity. Figure 3 visualizes 2D slices of typical volumes in each group. The simplest (dataset 1) consists of homogeneous cubes, each with a uniform $\sigma _t$ density value inside the cube and $\sigma _t=0$ (no scatterers) outside it. The two other datasets are heterogeneous, containing multiple sub-cubes with different densities. In dataset 2 heterogeneity is present only along one axis, and in dataset 3 we have more challenging heterogeneity along all 3 axes. We sample the $\sigma _t$ values of each cube such that we achieve optical depths in the range $[0.5,5]$. For simplicity all cubes are constrained to a volume of size $50\lambda \times 50\lambda \times 40\lambda$, where $\lambda$ is the wavelength. Tissue is known to have forward scattering phase functions [10,11,34], and thus we simulated the data using the Henyey-Greenstein phase function [57] with anisotropy parameters sampled in the range $0.93-0.97$. We used a numerical aperture of 0.5 for both illumination and imaging.
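To make the anisotropy parameter concrete, the following is a minimal sketch (ours, not part of the paper's simulator) of standard inverse-CDF sampling of scattering-angle cosines from the Henyey-Greenstein phase function; for the forward-scattering values $g\in[0.93,0.97]$ used above, the sampled cosines concentrate near $\cos\theta=1$:

```python
import numpy as np

def sample_hg_cosines(g, rng, n=1):
    """Draw n scattering-angle cosines from the Henyey-Greenstein phase
    function with anisotropy g, via the standard inverse-CDF formula."""
    xi = rng.random(n)
    if g == 0.0:
        return 1.0 - 2.0 * xi  # isotropic limit
    s = (1.0 - g * g) / (1.0 - g + 2.0 * g * xi)
    return (1.0 + g * g - s * s) / (2.0 * g)

rng = np.random.default_rng(0)
g = rng.uniform(0.93, 0.97)               # anisotropy range used for our datasets
cos_theta = sample_hg_cosines(g, rng, n=100_000)
print(g, cos_theta.mean())                # the mean cosine of HG equals g
```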

Fig. 3. Illustrating the 3 datasets used in this paper. Dataset 1 is the simplest, consisting of homogeneous cubes, Dataset 2 is heterogeneous along one axis, and Dataset 3 is hardest, heterogeneous along all 3 axes. (a) Ground truth. (b) Incoherent rendering. (c) Coherent rendering. As before, the colorbar in (a) represents densities, and in (b,c) image intensities.

We note that the original simulator of [32,33] considered homogeneous volumes only. We have extended it to simulate heterogeneous volumes with spatially varying densities, but a geometry consisting of a few cubes is significantly easier to render than general topologies, leading to the specific structure of our two heterogeneous datasets.

Confocal imaging geometry: We simulate a confocal microscope imaging geometry where we send an illumination beam focused at a 3D point $\textbf {o}$ and image with a lens focused at the same spot. We scan the volume in a grid of $50\times 50\times 10$ focus points as illustrated in Fig. 4(left). We note that as the optical depths (OD) of the media we target are high, light scatters multiple times and this confocal scan does not reveal a clear image of the medium. However, confocal microscopy theory [58] shows that by focusing the illumination at a single point rather than using wide field illumination, one can eliminate some of the scattered light and improve resolution.

Fig. 4. Imaging setup: we scan the 3D volume with a confocal setup where a light source is focused at a single point in the volume and the sensor is focused at another. In a pure confocal setting the illumination and sensing spots are identical, while in a displaced confocal scan the sensing spot is displaced with respect to the illumination spot. We mark the displacement using $\boldsymbol {\tau }$. We scan the volume using 10 z slices, with $50 {\times } 50$ grid points in each plane.

We used a transmission geometry, where the source is on one side of the sample and the sensor on the other, rather than a reflective geometry where both are on the same side. This is because we aim to study how memory effect correlations can improve the estimation, and for forward scattering materials such as tissue, the memory effect in a transmission geometry is much stronger than in a reflection geometry [59]. Classical frequency analysis of a confocal microscope under coherent illumination [58] states that it can achieve optical sectioning (distinguish objects at different $z$-planes) only in reflection mode, and loses this ability in transmission mode. To add more powerful depth features, we extend the classical confocal acquisition and consider also displaced confocal images (Fig. 4(Right)). In these configurations we focus the illumination beam at one 3D point $\textbf {o}$ and focus the sensing point at a nearby 3D position $\textbf {o}+\boldsymbol {\tau }$. This configuration has another important advantage, as it allows us to capture memory effect correlations between speckles. We will show below that when the displacement between illumination and viewing points is maintained, the speckles are correlated, and a learning network can take advantage of these correlations. Note that in practice displaced confocal measurements can be captured without a significant increase in acquisition time: when scanning the confocal illumination, rather than using a single pixel detector for the scattered light, one should use a 2D sensor and maintain the 2D image of the scattered light around the central spot [60].

We denote a complex confocal field obtained by a source and sensor focused at the same 3D point $\textbf {o}$ by $u(\textbf {o})$. Our standard confocal dataset consists of $M=1000$ arrays corresponding to different volumetric densities. Each array has the form $u^m(\textbf {o})$, where $\textbf {o}$ runs over a $50\times 50\times 10$ grid. The displaced confocal dataset includes fields of the form $u^m(\textbf {o},\boldsymbol {\tau })$, where the illumination is focused at $\textbf {o}$ and the sensing point is focused at $\textbf {o}+\boldsymbol {\tau }$; the displacement $\boldsymbol {\tau }$ is illustrated in Fig. 4(Right). We render a grid of $50\times 50\times 10$ focal points with $5\times 5$ displacements $\boldsymbol {\tau }$ for each. We denote the intensity image corresponding to each complex field by $I^m(\textbf {o},\boldsymbol {\tau })=|u^m(\textbf {o},\boldsymbol {\tau })|^2$.
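To make the data dimensions concrete, one possible in-memory layout of these arrays is sketched below (our illustration; the paper does not specify a storage format):

```python
import numpy as np

M = 1000                  # number of rendered volumes
GRID = (50, 50, 10)       # confocal focus points o per volume
DISP = (5, 5)             # grid of sensor displacements tau

# Complex-valued rendered fields (placeholders; filled by the simulator):
u = np.zeros((M, *GRID), dtype=np.complex64)              # u^m(o), pure confocal
u_disp = np.zeros((M, *GRID, *DISP), dtype=np.complex64)  # u^m(o, tau), displaced

# Intensity features fed to the networks:
I = np.abs(u) ** 2            # I^m(o)
I_disp = np.abs(u_disp) ** 2  # I^m(o, tau): 25 intensity features per focus point
```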

We acknowledge that the $50\times 50\times 10 \times 5 \times 5$ cubes we render only cover a small volume. We avoid rendering bigger ones, as for $M=1000$ such 3D volumes (with all displacements) already result in a very large amount of data, which poses non-trivial challenges in both the rendering and learning stages.

4. Deep learning models

Our goal is to take as input a confocal scan of the volume $I(\textbf {o})$, or a displaced confocal scan $I(\textbf {o},\boldsymbol {\tau })$, and predict the volumetric density $\sigma _t(\textbf {o})$.

4.1 Architecture

We build on the recent success of V-net architectures [61,62], and extend them to our 3D grid as in [63] by utilizing 3D convolutions rather than the classical 2D ones.

The V-net uses an encoder-decoder architecture with skip connections between the same levels of resolution. This architecture has been shown to be beneficial in propagating information between different image resolutions while still retrieving crisp high resolution edges. At each resolution level, for both encoder and decoder, we apply a sequence of two 3D convolutions with the same number of output features and use zero padding to maintain the image dimensions. Each convolution is followed by a LeakyReLU activation function. We use residual connections, adding the input of each two-layer convolution block to its output. In the encoder the result is downsampled using a convolution with stride $=2$, and we also add a skip connection, forwarding it directly to the decoder at the same level. In the decoder, we use fractionally strided (transposed) convolutions, i.e., stride $=0.5$, to up-sample the lower resolution features, and concatenate the result with the skip connection from the encoder. When we reduce resolution we increase the number of features in the following layer, as in prior V-net implementations. At the lowest level of resolution, the latent layer, we add Dropout to prevent overfitting.
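The following is a minimal sketch of one such encoder-decoder level in PyTorch (our illustration; the layer widths, dropout rate, and default LeakyReLU slope are our assumptions, and the actual network has 3 resolution levels with an output head at each level):

```python
import torch
import torch.nn as nn

class ConvBlock3D(nn.Module):
    """Two 3D convolutions with LeakyReLU activations and a residual
    connection; zero padding keeps the spatial dimensions unchanged."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(ch, ch, kernel_size=3, padding=1), nn.LeakyReLU(),
            nn.Conv3d(ch, ch, kernel_size=3, padding=1), nn.LeakyReLU(),
        )
    def forward(self, x):
        return x + self.body(x)  # residual connection (element-wise sum)

class TinyVNet3D(nn.Module):
    """A single-level 3D V-net sketch, assuming even spatial dimensions."""
    def __init__(self, in_ch=1, base=16):
        super().__init__()
        self.lift = nn.Conv3d(in_ch, base, kernel_size=1)
        self.enc0 = ConvBlock3D(base)
        # Downsample with a stride-2 convolution, doubling the features:
        self.down = nn.Conv3d(base, 2 * base, kernel_size=2, stride=2)
        self.latent = nn.Sequential(ConvBlock3D(2 * base), nn.Dropout3d(0.5))
        # "Stride = 0.5" up-sampling is a transposed convolution:
        self.up = nn.ConvTranspose3d(2 * base, base, kernel_size=2, stride=2)
        self.dec0 = ConvBlock3D(2 * base)  # operates on [upsampled, skip]
        self.head = nn.Conv3d(2 * base, 1, kernel_size=1)  # predicted sigma_t
    def forward(self, x):
        e0 = self.enc0(self.lift(x))
        z = self.latent(self.down(e0))
        d0 = torch.cat([self.up(z), e0], dim=1)  # skip connection (concat)
        return self.head(self.dec0(d0))
```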

Following Deeply Supervised Networks (DSN) [64,65] ideas, we attempt to better constrain the learning process by matching the target density map at each level of resolution rather than only the finest one.

To this end we denote by $\sigma _t^{m,j}$ the ground truth of the $m$’th training volume down-sampled to the $j$’th level of resolution, with

$${\sigma_t}^{m,j} = {\sigma_t}^{m}\downarrow^{2^j}.$$

For $j=0$ we get the original high resolution. Our network includes 3 levels of resolution.

Our learning process optimizes the following multi-level loss

$$\mathcal{L} = \frac{1}{M}\sum_{m=0}^{M}\sum_{j=0}^{2} \sum_{\textbf{o}\in {\cal V}}MSE(\hat{\sigma_t}^{m,j}(\textbf{o}),\sigma_t^{m,j}(\textbf{o}))$$
where $\hat {\sigma _t}$ denotes the density volume estimated by the network in each resolution level.
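A sketch of this deeply supervised loss in PyTorch follows (our illustration; in particular, using average pooling for the down-sampling operator of Eq. (1) is one plausible choice, not specified by the text):

```python
import torch.nn.functional as F

def multilevel_loss(preds, sigma_t):
    """Eq. (2): sum of MSE terms over resolution levels j = 0, 1, 2.
    preds[j] is the network output at level j; sigma_t is the full
    resolution ground truth, shaped (batch, 1, x, y, z)."""
    loss = 0.0
    for j, pred in enumerate(preds):
        target = sigma_t if j == 0 else F.avg_pool3d(sigma_t, kernel_size=2 ** j)
        loss = loss + F.mse_loss(pred, target)
    return loss
```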

Our architecture is illustrated in Fig. 5. As we explain below, we considered a few different learning tasks, each of which involved a different number of input features.

Fig. 5. DL architecture. We use 3D convolutions in all models. The V-net architecture uses skip connections between the encoding and decoding parts; residual connections within each convolution block are implemented as element-wise sums.

4.2 Training strategy

We train using the Adam [66] optimizer. We divide each dataset into $80\%$ training and $20\%$ test samples. We use a 5-fold cross validation, where each time we leave out $20\%$ of our training data and run a separate training process. The charts below display average performance, and present error bars representing the variance between these 5 repetitions. We train the network on slices of 5 z-planes. In practice our dataset is rendered with 10 z-planes, so we crop multiple slices of 5 planes from each sample and use them for augmentation. We use random batches of 240 training samples.
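The slab cropping used for augmentation amounts to the following (a short sketch of our reading of the procedure; taking every contiguous slab is our assumption):

```python
def crop_z_slabs(volume, slab=5):
    """Return all contiguous slabs of `slab` z-planes from a volume
    rendered with 10 z-planes, shaped (..., 50, 50, 10)."""
    nz = volume.shape[-1]
    return [volume[..., z:z + slab] for z in range(nz - slab + 1)]
```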

5. Experimental results

We demonstrate results from our learning framework and analyze various design considerations. We divide our analysis into three fronts.

  • • Learning strategies: we show that the deep learning approach outperforms naive least squares estimation, and moreover, the V-net architecture outperforms a simpler CNN architecture.
  • • Speckle data: we compare networks trained on speckle data with those trained using classical incoherent rendering strategies and show that as microscopy data cannot be measured without speckles, including these speckles in the training stage is vital.
  • • Feature space: we study how speckle statistics and in particular, features capturing memory effect correlation can further enhance performance.

5.1 Learning strategies

For simplicity we start by evaluating here only the pure confocal datasets and defer the displaced confocal features to Sec. 5.3.

Before turning to advanced deep learning strategies, we evaluate a simpler linear reconstruction. That is, we estimate $\hat {\sigma _t}$ as a 3D convolution of the speckle data $I$:

$$\hat{\sigma_t}^m = I^m \star H$$

We select the $H$ minimizing the least-squares reconstruction error on the training data. We set the kernel support to $3\times 3 \times 3$ using cross validation.
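This baseline reduces to an ordinary linear least-squares problem over the kernel entries. A direct, if memory-hungry, patch-based construction is sketched below (our illustration; the paper does not detail its solver, and we ignore the convolution/correlation flip and boundary handling):

```python
import numpy as np

def fit_linear_kernel(I_train, sigma_train, k=3):
    """Fit the k x k x k kernel H of Eq. (3) by least squares.
    I_train, sigma_train: matching lists of 3D arrays (speckle
    intensities and ground truth densities)."""
    r = k // 2
    rows, targets = [], []
    for I, s in zip(I_train, sigma_train):
        Ip = np.pad(I, r)  # zero-pad so every voxel has a full patch
        for x, y, z in np.ndindex(I.shape):
            rows.append(Ip[x:x + k, y:y + k, z:z + k].ravel())
            targets.append(s[x, y, z])
    A, b = np.asarray(rows), np.asarray(targets)
    H, *_ = np.linalg.lstsq(A, b, rcond=None)
    return H.reshape(k, k, k)
```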

Secondly, we compare the multi-resolution V-net architecture with a simple CNN one. Our CNN architecture is equivalent to the finest level of our V-net, consisting of 3D convolution and activation layers, see illustration in Fig. 5.

We compare the performance of our V-net architecture with the CNN architecture and the simple least-squares strategy, showing that the V-net architecture outperforms these by a large margin. We repeat the comparison on our 3 training sets. Figure 6 demonstrates numerical results on the confocal-only features. The chart reports the average squared difference between the ground truth and estimated density maps. Error bars encode the variance across multiple training processes on different splits of the training vs. validation data. In Fig. 6 we also visualize the different estimation strategies by looking at the different z-slices of one test volume. The V-net results are closer to the ground truth. The multi-level V-net architecture leads to better regularization with sharper, piecewise smooth estimates, while the single scale CNN and the linear estimate are over-smoothed and yet exhibit a lot of speckle noise. We further improve these results in Sec. 5.3 below using displaced measurements.

Fig. 6. Visual comparison of different estimation strategies using confocal measurements. (a) Ground truth. (b) Linear estimation. (c) CNN. (d) Our proposed V-net architecture.

5.2 Speckle utilization

In this section we study the importance of including speckle data in the training phase. In Fig. 7 we first compare training on the incoherent speckle-free data with training on the speckle data. Training on incoherent speckle-free data and testing on the same incoherent simulation leads to good results. However, as explained in Supplement 1, such measurements are unrealistic in a real acquisition setup. By training with realistic speckles we achieve comparable results, and furthermore, coherent speckles open the door for additional information due to memory effect correlations, as discussed in Sec. 5.3 below. While one may consider using the incoherent simulator for training and hope that it will generalize to real lab speckle measurements, we see that training on incoherent rendered volumes and testing on the speckle intensity data leads to an order-of-magnitude performance degradation. This highlights the importance of using training data with realistic speckle variations.

Fig. 7. Visualizing reconstruction results, illustrating the importance of incorporating realistic speckle effects in the training data. (a) Ground truth. (b) Training and testing on incoherent rendering. (c) Training on incoherent data, testing on speckle data. (d) Training on speckle data and testing on speckle data. (e) Training on blurred incoherent data and testing on blurred speckles. (f) Training on blurred speckles, testing on blurred speckles.

As another attempt to narrow the gap between incoherent rendering and realistic speckle data, we blurred the speckle intensity data, which is equivalent to imaging with a lower numerical aperture. These blurred speckle intensities have lower resolution compared to the actual speckle rendering but also lower speckle variance; see the visualization in Fig. 8(b,d). Yet, they are not completely speckle free, as we only blurred over a limited support of $5\lambda \times 5 \lambda$. Training and testing on these blurred speckle intensities leads to lower quality results when compared with training on the high resolution speckle images, as demonstrated in Fig. 7.

Fig. 8. Visualizing the different network inputs and the three different datasets with different levels of heterogeneity. We visualize an xy plane sliced at a single z value. We demonstrate: (a) the ground truth density $\sigma _t$, (b) a rendered speckle image, (c) the incoherent (speckle free) rendering, (d) blurred speckles, (e) blurred incoherent renderings.

While our original incoherent simulator assumed an idealized high resolution which cannot be achieved in practice, we also blurred it with a $5\lambda \times 5 \lambda$ kernel to match the blurred coherent simulation above. As the blurred speckle intensities are not completely speckle free, there is still a gap between the training data and testing data, leading to degraded results in Fig. 7.

In all cases, there are differences between the three training datasets, and the heterogeneous datasets show increased complexity compared to the homogeneous one.

5.3 Feature space

Next we consider additional features that can enhance the learning process.

Displaced confocal measurements: As illustrated in Fig. 4, in displaced measurements we assume that the focused sensing spot is displaced relative to the focused illumination point. We consider a grid of $5\times 5$ displacements $\boldsymbol {\tau }$. Due to the memory effect of speckles, these displaced measurements are correlated with the standard confocal measurements, and by exploiting this correlation we enhance the network performance. We thus train a V-net which takes as input 25 additional features, corresponding to speckle intensity measurements at displaced positions. We demonstrate the numerical gain from these features in Fig. 9. The addition of the displacement features results in finer grain reconstructions, as visualized in Fig. 9. The motivation for these features was first introduced by [67] and later re-implemented by [68] as off-diagonal probing. Their idea is that when we focus the light source and sensor at the same spot (confocal), we capture both direct light paths that scattered once and indirect light paths that scattered multiple times. On the other hand, if we illuminate one spot and measure at a nearby one, we do not capture direct single scattering paths, while the multiple scattering component is similar to that of the confocal measurement. Thus, by subtracting the displaced confocal measurement from the confocal one, they can obtain pure single scattering images. This scheme is demonstrated visually in Fig. 10, and sketched in code below. With a deep learning architecture we learn a linear combination of the confocal and displaced confocal features, and subtracting one feature from the other is one option inside our search space; hence we can learn an estimation rule which generalizes this idea.
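A simplified sketch of this subtraction (our illustration of the idea in [67]; averaging the off-center displacements and placing $\tau=0$ at the grid center are our assumptions):

```python
import numpy as np

def direct_component(I_disp):
    """Estimate the single-scattering (direct) component by subtracting
    the average displaced confocal measurement from the confocal one.
    I_disp: intensities of shape (..., 5, 5) over the displacement grid,
    with tau = 0 assumed at the grid center (index (2, 2))."""
    center = I_disp[..., 2, 2]
    n_off = I_disp.shape[-1] * I_disp.shape[-2] - 1
    off = (I_disp.sum(axis=(-2, -1)) - center) / n_off
    return np.maximum(center - off, 0.0)  # clip negative estimates
```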

Fig. 9. Visualizing reconstruction results, demonstrating the benefits of additional features. (a) Ground truth. (b) Pure confocal measurements. (c) Incorporating displaced confocal speckle intensities. (d) Incorporating complex correlation features (products of displaced fields).

Fig. 10. Fuchs et al. [67] enhance direct illumination using displaced confocal measurements. In (a) we plot the idealized paths of direct illumination and imaging. (b) In practice, a lot of scattered light will be included in the measurement. (c) In a displaced confocal measurement the direct path is not present, but multiple scattering paths are rather similar. Thus by subtracting (c) from (b) as in [67], it is possible to enhance the direct image component.

Correlation features: In an effort to directly utilize speckle correlations we consider features which measure the complex products of displaced fields, of the form

$$C^m(\textbf{o},\boldsymbol{\tau},\boldsymbol{\Delta})=u^m(\textbf{o},\boldsymbol{\tau})\cdot u^m(\textbf{o}+\boldsymbol{\Delta},\boldsymbol{\tau})^*,$$
where $^*$ denotes complex conjugation. We note that using $\boldsymbol {\Delta }=0$ in Eq. (4) corresponds to the standard displaced confocal intensity measurements considered above. For $\boldsymbol {\Delta }\neq 0$, the features are complex, and acquiring them in an optical system would require a non-trivial interferometric implementation. In our synthetic simulation we are free to study the potential gain from such features, which allows us to pre-assess whether building such an interferometric system would be beneficial.

As we render $25$ displacement values $\boldsymbol {\tau }$, we have a set of $25^2$ correlation features we could consider, but inserting all of them into the network is computationally prohibitive. To reduce the burden, we followed the analysis of [69] to select $96$ pairs where higher correlation is expected. These $96$ features include the $25$ intensity measurements from the previous analysis.
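Computing one such feature map amounts to a conjugate product of shifted field grids, e.g. (a sketch under our assumptions; periodic boundary handling via np.roll is a simplification, and the pair selection rule of [69] is not shown):

```python
import numpy as np

def correlation_feature(u_disp, delta, tau):
    """Eq. (4): C(o, tau, delta) = u(o, tau) * conj(u(o + delta, tau)).
    u_disp: complex fields of shape (50, 50, 10, 5, 5);
    delta: integer 3D shift on the focus grid; tau: displacement index."""
    u_tau = u_disp[..., tau[0], tau[1]]                    # u(o, tau)
    u_shifted = np.roll(u_tau, shift=tuple(-d for d in delta), axis=(0, 1, 2))
    return u_tau * np.conj(u_shifted)                      # complex feature map
```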

Figure 9 demonstrates the gain from such correlation features numerically and visually. Overall, the displaced confocal intensity features improve over the standard confocal measurements, and the complex correlation features improve further.

6. Conclusion

This paper explores the application of deep learning for microscopic volumetric reconstruction of material density. We design a V-net architecture which is capable of regularizing the estimation at multiple resolutions, leading to sharp piecewise smooth estimates. We study the importance of modeling realistic speckles as part of the training phase and show that for high resolution microscopic estimation, coherent simulation strategies are a must, while classical incoherent simulation strategies fail completely. We also explore feature spaces which can help the network make better use of the statistical properties of speckles, such as the memory effect. We hope that this study will pave the way for a real world implementation of microscopic volumetric reconstruction, and will open the door to the application of machine learning techniques in this challenging domain.

Funding

Ollendorff Minerva Center for Vision and Image Sciences; H2020 European Research Council (635537); United States-Israel Binational Science Foundation (2008123/2019758); Israel Science Foundation (1947/20).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. D. A. Boas, D. H. Brooks, E. L. Miller, C. A. DiMarzio, M. Kilmer, R. J. Gaudette, and Q. Zhang, “Imaging the body with diffuse optical tomography,” IEEE Signal Process. Mag. 18(6), 57–75 (2001). [CrossRef]  

2. T. Durduran, R. Choe, W. B. Baker, and A. G. Yodh, “Diffuse optics for tissue monitoring and tomography,” Rep. Prog. Phys. 73(7), 076701 (2010). [CrossRef]  

3. B. J. Berne and R. Pecora, Dynamic light scattering: with applications to chemistry, biology, and physics (Courier Corporation, 2000).

4. D. Sumin, T. Rittig, V. Babaei, T. Nindel, A. Wilkie, P. Didyk, B. Bickel, J. Křivánek, K. Myszkowski, and T. Weyrich, “Geometry-aware scattering compensation for 3d printing,” ACM Trans. Graph. 38(4), 1–14 (2019). [CrossRef]

5. D. Pine, D. Weitz, J. Zhu, and E. Herbolzheimer, “Diffusing-wave spectroscopy: dynamic light scattering in the multiple scattering limit,” J. Phys. 51(18), 2101–2127 (1990). [CrossRef]  

6. H. Jensen, S. Marschner, M. Levoy, and P. Hanrahan, “A practical model for subsurface light transport,” in Proceedings of SIGGRAPH 2001, Annual Conference Series, (2001).

7. C. Donner and H. Jensen, “Light diffusion in multi-layered translucent materials,” ACM Trans. Graph. 24(3), 1032–1039 (2005). [CrossRef]  

8. M. Papas, C. Regg, W. Jarosz, B. Bickel, P. Jackson, W. Matusik, S. Marschner, and M. Gross, “Fabricating translucent materials using continuous pigment mixtures,” ACM Trans. Graph. 32(4), 1–12 (2013). [CrossRef]  

9. J. Wang, S. Zhao, X. Tong, S. Lin, Z. Lin, Y. Dong, B. Guo, and H. Shum, “Modeling and rendering of heterogeneous translucent materials using the diffusion equation,” ACM Trans. Graph. 27(1), 1–18 (2008). [CrossRef]  

10. W.-F. Cheong, S. A. Prahl, and A. J. Welch, “A review of the optical properties of biological tissues,” IEEE J. Quantum Electron. 26(12), 2166–2185 (1990). [CrossRef]  

11. V. Tuchin, Tissue Optics: Light Scattering Methods and Instruments for Medical Diagnosis (SPIE, 2000).

12. C. Liu, A. K. Maity, A. W. Dubrawski, A. Sabharwal, and S. G. Narasimhan, “High resolution diffuse optical tomography using short range indirect subsurface imaging,” 2020 IEEE International Conference on Computational Photography (ICCP), (2020).

13. T. Hawkins, P. Einarsson, and P. Debevec, “Acquisition of time-varying participating media,” ACM Trans. Graph. 24(3), 812–815 (2005). [CrossRef]  

14. C. Fuchs, T. Chen, M. Goesele, H. Theisel, and H. Seidel, “Density estimation for dynamic volumes,” Comput. & Graph. 31(2), 205–211 (2007). [CrossRef]  

15. J. Gu, S. Nayar, E. Grinspun, P. Belhumeur, and R. Ramamoorthi, “Compressive structured light for recovering inhomogeneous participating media,” Proceedings of the European Conference on Computer Vision 2008, 845–858 (2008).

16. S. Narasimhan, M. Gupta, C. Donner, R. Ramamoorthi, S. Nayar, and H. Jensen, “Acquiring scattering properties of participating media by dilution,” ACM Trans. Graph. 25(3), 1003–1012 (2006). [CrossRef]  

17. S. Prahl, M. van Gemert, and A. Welch, “Determining the optical properties of turbid media by using the adding–doubling method,” Appl. Opt. 32(4), 559–568 (1993). [CrossRef]  

18. V. Antyufeev, Monte Carlo method for solving inverse problems of radiation transfer, vol. 20 (Inverse and Ill-Posed Problems Series, V.S.P. International Science, 2000).

19. I. Gkioulekas, S. Zhao, K. Bala, T. Zickler, and A. Levin, “Inverse volume rendering with material dictionaries,” ACM TOG 32(6), 1–13 (2013). [CrossRef]  

20. I. Gkioulekas, A. Levin, and T. Zickler, “An evaluation of computational imaging techniques for heterogeneous inverse scattering,” European Conference on Computer Vision (2016).

21. A. Levis, Y. Schechner, A. Aides, and A. Davis, “Airborne three-dimensional cloud tomography,” IEEE International Conference on Computer Vision (2015).

22. A. Levis, Y. Y. Schechner, and A. B. Davis, “Multiple-scattering microphysics tomography,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (2017), pp. 6740–6749.

23. P. Khungurn, D. Schroeder, S. Zhao, K. Bala, and S. Marschner, “Matching real fabrics with micro-appearance models,” ACM Trans. Graph. 35(1), 1–26 (2015). [CrossRef]  

24. A. Geva, Y. Y. Schechner, Y. Chernyak, and R. Gupta, “X-ray computed tomography through scatter,” Proceedings of the European Conference on Computer Vision 2018, 34–50 (2018).

25. T. Loeub, A. Levis, V. Holodovsky, and Y. Schechner, “Monotonicity prior for cloud tomography,” Proceedings of the European Conference on Computer Vision 2020, 283–299 (2020). [CrossRef]  

26. C. Che, F. Luan, S. Zhao, K. Bala, and I. Gkioulekas, “Towards learning-based inverse subsurface scattering,” 2020 IEEE International Conference on Computational Photography (ICCP), (2020).

27. I. Gkioulekas, B. Walter, K. Bala, T. Zickler, and E. Adelson, “On the appearance of translucent edges,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015).

28. M. D. Dogan, S. V. Acevedo Colon, V. Sinha, K. Akşit, and S. Mueller, “Sensicut: Material-aware laser cutting using speckle sensing and deep learning,” The 34th Annual ACM Symposium on User Interface Software and Technology, (2021).

29. B. Thierry, X. Antoine, C. Chniti, and H. Alzubaidi, “μ-diff: An open-source matlab toolbox for computing multiple scattering problems by disks,” Comput. Phys. Commun. 192, 348–362 (2015). [CrossRef]  

30. B. E. Treeby and B. T. Cox, “k-wave: Matlab toolbox for the simulation and reconstruction of photoacoustic wave-fields,” J. Biomed. Opt. 15(2), 021314 (2010). [CrossRef]  

31. K. Yee, “Numerical solution of initial boundary value problems involving maxwell’s equations in isotropic media,” IEEE Trans. Antennas Propagat. 14(3), 302–307 (1966). [CrossRef]  

32. C. Bar, M. Alterman, I. Gkioulekas, and A. Levin, “A monte carlo framework for rendering speckle statistics in scattering media,” ACM TOG 38(4), 1–22 (2019). [CrossRef]  

33. C. Bar, I. Gkioulekas, and A. Levin, “Rendering near-field speckle statistics in scattering media,” ACM TOG 39(6), 1–18 (2020). [CrossRef]  

34. T. Igarashi, K. Nishino, and S. K. Nayar, “The appearance of human skin: a survey,” Found. Trends Comput. Graph. Vis. 3(1), 1–95 (2007). [CrossRef]  

35. L. Gagnon, M. Desjardins, J. Jehanne-Lacasse, L. Bherer, and F. Lesage, “Investigation of diffuse correlation spectroscopy in multi-layered media including the human head,” Opt. Express 16(20), 15514–15530 (2008). [CrossRef]  

36. J. Sutin, B. Zimmerman, D. Tyulmankov, D. Tamborini, K. C. Wu, J. Selb, A. Gulinatti, I. Rech, A. Tosi, D. A. Boas, and M. A. Franceschini, “Time-domain diffuse correlation spectroscopy,” Optica 3(9), 1006–1013 (2016). [CrossRef]  

37. E. M. Buckley, A. B. Parthasarathy, P. E. Grant, A. G. Yodh, and M. A. Franceschini, “Diffuse correlation spectroscopy for measurement of cerebral blood flow: future prospects,” Neurophotonics 1(1), 011009 (2014). [CrossRef]  

38. S. Chowdhury, M. Chen, R. Eckert, D. Ren, F. Wu, N. Repina, and L. Waller, “High-resolution 3d refractive index microscopy of multiple-scattering samples from intensity images,” Optica 6(9), 1211–1219 (2019). [CrossRef]  

39. L. Tian and L. Waller, “3d intensity and phase imaging from light field measurements in an led array microscope,” Optica 2(2), 104–111 (2015). [CrossRef]  

40. W. Choi, C. Fang-Yen, K. Badizadegan, S. Oh, N. Lue, R. R. Dasari, and M. S. Feld, “Tomographic phase microscopy,” Nat. Methods 4(9), 717–719 (2007). [CrossRef]  

41. S. Lee, K. Kim, A. Mubarok, A. Panduwirawan, K. Lee, S. Lee, H. Park, and Y. Park, “High-resolution 3-d refractive index tomography and 2-d synthetic aperture imaging of live phytoplankton,” J. Opt. Soc. Korea 18(6), 691–697 (2014). [CrossRef]  

42. Y. Sun, Z. Xia, and U. S. Kamilov, “Efficient and accurate inversion of multiple scattering with deep learning,” arXiv:1803.06594 (2018).

43. V. Sitzmann, J. N. Martel, A. W. Bergman, D. B. Lindell, and G. Wetzstein, “Implicit neural representations with periodic activation functions,” 34th Conference on Neural Information Processing Systems (2020).

44. E. Valent and Y. Silberberg, “Scatterer recognition via analysis of speckle patterns,” Optica 5(2), 204–207 (2018). [CrossRef]  

45. J. Bertolotti, E. G. van Putten, C. Blum, A. Lagendijk, W. L. Vos, and A. P. Mosk, “Non-invasive imaging through opaque scattering layers,” Nature 491(7423), 232–234 (2012). [CrossRef]  

46. O. Katz, P. Heidmann, M. Fink, and S. Gigan, “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlation,” Nat. Photonics 8(10), 784–790 (2014). [CrossRef]  

47. J. Chang and G. Wetzstein, “Single-shot speckle correlation fluorescence microscopy in thick scattering tissue with image reconstruction priors,” J. Biophotonics 11(3), e201700224 (2018). [CrossRef]  

48. Y. Li, Y. Xue, and L. Tian, “Deep speckle correlation: a deep learning approach toward scalable imaging through scattering media,” Optica 5(10), 1181–1190 (2018). [CrossRef]  

49. S. Li, M. Deng, J. Lee, A. Sinha, and G. Barbastathis, “Imaging through glass diffusers using densely connected convolutional networks,” Optica 5(7), 803–813 (2018). [CrossRef]  

50. R. Horisaki, R. Takagi, and J. Tanida, “Learning-based imaging through scattering media,” Opt. Express 24(13), 13738–13743 (2016). [CrossRef]  

51. M. Lyu, H. Wang, G. Li, S. Zheng, and G. Situ, “Learning-based lensless imaging through optically thick scattering media,” Adv. Photonics 1(03), 1–10 (2019). [CrossRef]  

52. Y. Sun, X. Wu, Y. Zheng, J. Fan, and G. Zeng, “Scalable non-invasive imaging through dynamic scattering media at low photon flux,” Opt. Lasers Eng. 144, 106641 (2021). [CrossRef]  

53. Y. Gao, W. Xu, Y. Chen, W. Xie, and Q. Cheng, “Deep learning-based photoacoustic imaging of vascular network through thick porous media,” arXiv:2103.13964 (2021).

54. P. Dutre, K. Bala, and P. Bekaert, Advanced Global Illumination (A K Peters, Natick, MA, 2006 (2nd edition)).

55. J. W. Goodman, Speckle Phenomena in Optics: Theory and Applications (Roberts and Company Pub., 2007).

56. J. Mertz, Introduction to optical microscopy. (Roberts, 2010).

57. L. Henyey and J. Greenstein, “Diffuse radiation in the galaxy,” The Astrophys. J. 93, 70–83 (1941). [CrossRef]  

58. J. Mertz, Introduction to optical microscopy (Cambridge University Press, 2019).

59. C. Bar, M. Alterman, L. Gkioulekas, and A. Levin, “Single scattering modeling of speckle correlation,” in 2021 IEEE International Conference on Computational Photography (ICCP), (2021), pp. 1–16.

60. T. I. Sommer and O. Katz, “Pixel-reassignment in ultrasound imaging,” Appl. Phys. Lett. 119(12), 123701 (2021). [CrossRef]  

61. F. Milletari, N. Navab, and S.-A. Ahmadi, “V-net: Fully convolutional neural networks for volumetric medical image segmentation,” 2016 Fourth International Conference on 3D Vision (3DV), (2016).

62. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (2015).

63. Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, “3d u-net: Learning dense volumetric segmentation from sparse annotation,” Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016 (2016).

64. C.-Y. Lee, S. Xie, P. Gallagher, Z. Zhang, and Z. Tu, “Deeply-supervised nets,” arXiv:1409.5185 (2014).

65. Q. Dou, L. Yu, H. Chen, Y. Jin, X. Yang, J. Qin, and P.-A. Heng, “3d deeply supervised network for automated segmentation of volumetric medical images,” Med. Image Anal. 41, 40–54 (2017). [CrossRef]  

66. D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 3rd International Conference for Learning Representations (2015).

67. C. Fuchs, M. Heinz, M. Levoy, H.-P. Seidel, and H. P. A. Lensch, “Combining confocal imaging and descattering,” Comput. Graph. Forum 27(4), 1245–1253 (2008). [CrossRef]  

68. M. O’Toole, R. Raskar, and K. N. Kutulakos, “Primal-dual coding to probe light transport,” ACM Trans. Graph. 31(4), 1–11 (2012). [CrossRef]  

69. C. Bar, M. Alterman, I. Gkioulekas, and A. Levin, “Single scattering modeling of speckle correlation,” in 2021 IEEE International Conference on Computational Photography (ICCP), (2021).

