
Anti-noise diffractive neural network for constructing an intelligent imaging detector array

Open Access

Abstract

To develop an intelligent imaging detector array, a diffractive neural network with strong robustness, based on Weight-Noise-Injection training, is proposed. Through layered diffractive transformation in the presence of several existing errors, accurate and fast object classification can be achieved. Because the mapping between the input image and the label can still be learned in the Weight-Noise-Injection training mode, the prediction of the optical network becomes insensitive to disturbances, which improves its noise resistance remarkably. By comparing accuracies under different noise conditions, it is verified that the proposed model exhibits higher accuracy.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Artificial neural networks have drastically impacted many areas in recent years, such as pattern recognition, dependable object detection, and image super-resolution [1-5]. Because the number of key parameters and connections in neural networks has grown dramatically, the computing capability of traditional central processing units (CPUs) cannot meet the demand for rapid or even real-time imaging response. Although computing hardware such as IBM's TrueNorth chip, Google's tensor processing units (TPUs), and graphical processing units (GPUs) has developed rapidly to further increase the processing speed of current neural networks, its computing speed is still severely limited by intrinsic electron mobility and operating frequency.

As is well known, using lightwaves to process information offers several impressive advantages, such as ultrahigh interconnection speed, broad bandwidth, and low crosstalk, and demonstrates a possibility of breaking through the frequency bottleneck of conventional computing hardware. So far, two kinds of optical neural networks have been proposed. The first relies on silicon photonics, in which matrix multiplication is realized on an optical on-chip platform consisting of several Mach-Zehnder interferometers [6,7]. The second is physically constructed by stacking isolated diffractive surfaces with functional phase arrangements so as to cooperatively perform relatively complex optical transformations [8,13,14,17,18,20]. As demonstrated, typical linear transformations can be implemented at the speed of light, enabling object detection at rates exceeding 100 GHz by all-optical neural networks based on precise phase modulation [9]. This field is currently evolving fast, e.g., faster ways to compute backpropagation [26], beam-steering technology with diffractive neural networks [27], and residual diffractive neural networks [28]. All in all, by learning the features of the acquired data, an optical network can intelligently select the weights of superimposed plane-wave components according to the spatial-spectrum characteristics of diffractive beams.

At present, electrically modulated imaging detection based on prior knowledge or control algorithms is used extensively. The primary operations include typical graphic information processing and adaptive opto-electronic correction, based on adjusting both the imaging lightwave parameters, such as wavefront, spectrum, wavevector, polarization, or amplitude, and the imaging coefficients, for instance, the point spread function or the spatial-frequency distribution of diffractive imaging lightwaves. Since the processing or response speed remains at the scale of the system's electronic circuits, and the adaptability with respect to intricate environmental factors or complex targets is relatively poor, there is an urgent requirement for learnable detection that manipulates the micro-nano light fields constructed by the imaging objective lens, so as to efficiently implement speedy automatic object acquisition, i.e., intelligent imaging detection.

The all-optical diffractive neural network (DNN) mentioned above, which can learn an end-to-end mapping from a given training set with a relatively simple sequential structure and thus implement fast transformation and recognition of the spatial information of targets, demonstrates an attractive prospect: a new type of intelligent detection architecture constructed by directly coupling the DNN onto the photosensitive surface of a sensor array. For DNNs based on the Huygens-Fresnel principle, most reported versions rely on a computer to implement backpropagation for optimizing the trainable parameters (each layer's amplitude and phase) and then realize classification or imaging with ultra-low energy consumption using multilayer diffractive phase plates. Generally, the network parameters obtained in previous work are purely theoretical results, derived without considering various errors, and thus easily make the network susceptible to disturbances under all-optical conditions. For the intelligent detection chip with a DNN, we can expect four categories of processing errors: (1) motion blur; (2) wavefront distortion created by aero-optical effects; (3) lightwave frequency shifting attributed to the optical Doppler effect; and (4) structural errors in the imaging system and core devices, for example, 3D-printer precision error, alignment error, layer-spacing error, and abrasion or arrangement errors of the phase plates. To mitigate the impact of the structural errors, one can improve hardware allocation precision and correct the data by post-processing, as in conventional imaging detection. However, executing optical information processing and optimizing the training strategy may be a faster and lower-energy way.

In this paper, a strongly robust diffractive neural network (SRNN) mathematical model is proposed for guiding the design and construction of an intelligent imaging detector array by remarkably mitigating, in a non-hardware way, the influence of structural errors and lightwave frequency shifting on the DNN output. By adding Gaussian noise with different standard deviations to the weights during training, the error distribution of the actual phase mask is simulated, making the optimized DNN more resistant to hardware errors. In other words, Weight-Noise-Injection training forces the optimal weights toward a minimum region that is relatively insensitive to errors, instead of only performing Stochastic Gradient Descent (SGD) to the minimum value of the initial loss function. It should be noted that network training with noise has been a hot issue since the 1990s and has been applied widely, e.g., to improving network generalization performance [29,30], data enhancement [31], building denoising autoencoders [32], and network label smoothing [33]; the Weight-Noise-Injection scheme in particular has been applied to typical neural networks such as Radial Basis Function (RBF) networks, Multilayer Perceptrons (MLPs), and recurrent neural networks to improve their convergence ability and generalization [10-12,15]. As a method of adding additive noise to the network's hidden units, it alleviates the over-fitting problem to a certain extent, but it is prone to ill-conditioning. In general, the rectified linear hidden units of a conventional computer-based neural network make the added noise insignificant by increasing the network weights during training, i.e., by relatively reducing the value of the fixed-scale additive noise. Although this makes it far easier to reduce the loss value than forcing the network to look for insensitive regions, it is not a reasonable and universal solution; currently, it is mainly replaced by methods that add multiplicative noise (Dropout) [16,22,23]. Our study aims to enhance the resistance of the optical neural network to multiple types of environmental errors in reality, rather than to artificial weight errors [12], i.e., to let the optical network maintain high-precision prediction in a complex disturbance environment. In the field of optical neural networks, researchers have proposed a vaccinated D2NN model for improving the robustness of the optical network to alignment errors [19]. The method proposed in this research is more versatile: the influence of (3) lightwave frequency shifting and (4) structural errors on the optical network is obviously reduced from the perspective of backpropagation, laying a concrete foundation for constructing a new type of all-optical intelligent imaging detector array.

2. Proposed method

Generally, imaging sensors are used to convert the compressed light field into opto-electronic signal arrays, leading to final digital target images. For complex objects or backgrounds, such as a typically unstable flow field or even turbulence, an adaptive imaging architecture can be introduced by adding a beam splitter to construct a wavefront-measurement light path, or even by directly coupling a functional micro-optics structure with the imaging sensors, so that the wavefront-adjusting signals are based on prior knowledge or pattern-recognition algorithms for acquiring intrinsic object images [25]. Similar operations are already conducted in spectral imaging and polarization imaging. To perform speedy or even real-time object detection and recognition using only an adaptive sensor array, for dynamic or even hypersonic-velocity targets in an intricate background, a scheme constructing an intelligent imaging detection architecture (IIDA) should be a feasible solution based on the current development of optical neural networks, as illustrated in Fig. 1.

Fig. 1. Schematic diagram of an all-optical IIDA inserted between the objective lens of a conventional imaging system and the detector array utilized, which will be influenced by several critical factors indicated by signs (1), (2), (3) and (4).

As shown in this figure, an all-optical IIDA corresponding to a moving target in a dynamic airflow field is constructed by directly inserting an optical neural network between the objective lens and the detector array, the main functional components of a conventional imaging system, so as to effectively manipulate the micro-nano light fields shaped by the objective lens and then continuously perform opto-electronic operations. The purpose of the all-optical neural network is to cope effectively with complex targets and environments and thereby realize efficient, intelligent, automatic object recognition. In other words, by arranging an optical neural network with learning ability in front of the imaging sensors, already trained on noisy samples to alleviate the impact of environmental disturbance, the IIDA realizes a type of all-optical intelligent detection. Signs (1), (2), (3), and (4) indicate the imaging factors related to the basic structure of the proposed IIDA and its main problems to be solved: the motion blur of a target with velocity $\vec{v}$, the relative velocity of the object with respect to the IIDA; the wavefront distortion originating from the airflow field; the lightwave frequency shifting described by the wavelength variation $\delta l$; and the structural errors of layer $i$, expressed by $\delta {x_i}$, $\delta {y_i}$, and $\delta {z_i}$, which vary along the x-, y-, and z-axes, respectively, while $\delta {h_i}$ represents the 3D-printing error of layer $i$ along the z-axis. In this paper, we only discuss the influence of errors (3) and (4).

In this paper, we use a purely phase-modulated optical network to classify target data in the presence of different kinds of disturbance. For this diffractive neural network, the propagation of lightwaves can be described as follows [8]:

$$m_i^l = {e^{\textrm{j}{\varphi _2}}},$$
$$w_i^l = \frac{{{z^l} - {z^{l - 1}}}}{{{r^2}}}(\frac{1}{{2\mathrm{\pi }r}} + \frac{1}{{\textrm{j}\lambda }})\textrm{exp} (\frac{{\textrm{j}2\mathrm{\pi }r}}{\lambda }),$$
$$r = \sqrt {{{(x_i^l - x_j^{l - 1})}^2} + {{(y_i^l - y_j^{l - 1})}^2} + {{(z_i^l - z_j^{l - 1})}^2}} ,$$
$$output_i^l = w_i^l \times \sum\limits_k {output_k^{l - 1}} \times m_i^l = w_i^l \times |A |{e^{\textrm{j}{\varphi _1}}} \times {e^{\textrm{j}{\varphi _2}}} = |{{A_w}} |{e^{\textrm{j}\Delta \varphi }},$$
where $m_i^l$ is the modulation of the $i$-th neuron of layer $l$; ${\varphi _2}$ represents the phase change of the lightwave caused by the phase plate; $w_i^l$ is the Rayleigh-Sommerfeld diffraction coefficient between a neuron in layer $l$ and a neuron in layer $l - 1$; ${z^l}$ denotes the coordinate of the $l$-th layer on the z-axis; $r$ denotes the distance between the $i$-th neuron of layer $l$ and the $j$-th neuron of layer $l - 1$; $output_i^l$ is the output lightwave of the $i$-th neuron of layer $l$; ${\varphi _1}$ represents the phase change of the lightwave due to diffraction; $A$ is the amplitude of the output lightwave of layer $l - 1$; and ${A_w}$ is the amplitude of the output lightwave of layer $l$.
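
To make Eqs. (2)-(3) concrete, the following NumPy sketch evaluates the Rayleigh-Sommerfeld coupling between a single pair of neurons; the function name `rs_weight`, the sample coordinates, and the point-wise (neuron-by-neuron) evaluation are illustrative assumptions rather than the authors' implementation:

```python
import numpy as np

def rs_weight(p_l, p_prev, wavelength):
    """Rayleigh-Sommerfeld coupling w_i^l between a neuron at p_prev on
    layer l-1 and a neuron at p_l on layer l, following Eqs. (2)-(3)."""
    dx, dy = p_l[0] - p_prev[0], p_l[1] - p_prev[1]
    dz = p_l[2] - p_prev[2]                        # z^l - z^{l-1}
    r = np.sqrt(dx**2 + dy**2 + dz**2)             # Eq. (3)
    return (dz / r**2) * (1.0 / (2.0 * np.pi * r) + 1.0 / (1j * wavelength)) \
        * np.exp(1j * 2.0 * np.pi * r / wavelength)  # Eq. (2)

# Example: neurons on adjacent layers spaced 3 cm apart, 400 GHz source
lam = 3e8 / 400e9                                  # wavelength = 0.75 mm
w = rs_weight((0.0, 0.0, 0.03), (0.4e-3, 0.0, 0.0), lam)
```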

Figure 2 illustrates the forward-propagation process of the DNN in both the simulations and the actual experiments, where a standard Fourier transform is used in simulation to calculate the parameters of the diffractive process based on the spatial-spectrum distribution of the diffractive beams. The parameter ${m^l}$ denotes the $l$-th diffractive layer. $H$ denotes a constant diffraction transfer function representing a mathematical computation according to the Rayleigh-Sommerfeld (R-S) diffraction formula [24]. It should be noted that in actual calculations we use the spatial-spectrum method, encoding the inputs in the Fourier domain to calculate the complex amplitude distribution, which makes training more efficient [8,21]. Each green arrow indicates a calculation based on the spatial-spectrum method that shapes the complex amplitude distribution over the next diffractive plane. In practice, an actual diffractive neural network generally uses a Gaussian light source to project beams onto the object-lightwave modulation masks, constructing the final light field or light-intensity image from the shaped complex amplitude distribution. The input lightwaves are first diffracted in free space and then coded by each diffractive layer to achieve a classification or imaging operation. The noteworthy difference between the two processes is that previous works simulated only a noise-free light-propagation process, without considering the static or dynamic structural errors of the imaging system or the light-frequency shifting produced in an actual object detection and imaging process.
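
A minimal sketch of one diffractive layer under the spatial-spectrum method is given below: the field is moved to the Fourier domain, multiplied by the transfer function $H$, brought back, and then phase-modulated. The function names and grid parameters are our own illustration, not the authors' code:

```python
import numpy as np

def propagate(field, dz, wavelength, pitch):
    """Free-space propagation over distance dz by the spatial-spectrum
    (angular-spectrum) method: FFT -> multiply by H -> inverse FFT."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pitch)                    # spatial frequencies
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    mask = arg > 0.0                                   # drop evanescent waves
    H = np.exp(1j * 2.0 * np.pi * dz * np.sqrt(np.maximum(arg, 0.0))) * mask
    return np.fft.ifft2(np.fft.fft2(field) * H)

def layer_forward(field, phase, dz, wavelength, pitch):
    """One diffractive layer: diffraction, then phase-only modulation m^l."""
    return propagate(field, dz, wavelength, pitch) * np.exp(1j * phase)
```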

Fig. 2. DNN simulation and optical arrangement for the actual experimental process.

2.1 Selection of loss function

We use cross-entropy as the loss function of the diffractive neural network:

$$L(p,q) = - \sum\limits_x {(p(x)\log q(x) + (1 - p(x))\log (1 - q(x)))} ,$$
where $p$ is the expected probability distribution and $q$ is the probability distribution of the network output. Compared with the mean-squared-error loss function, the cross-entropy loss function is more conducive to handling classification problems [16]. It encourages the probability distribution over the options other than the correct choice to be more uniform and to tend toward smaller values, which further reduces the probability of a wrong option receiving a high vote. In other words, the ideal situation is to maximize the ground-truth response while making the proportions of the other options more even, instead of concentrating on one particular erroneous option.
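
As a sketch of Eq. (5) in the TensorFlow framework used later in Sec. 3, the snippet below computes the loss from a one-hot target and the network's output distribution; treating the ten normalized detector-region intensities as $q$ is our assumption about the readout:

```python
import tensorflow as tf

def cross_entropy_loss(p, q, eps=1e-9):
    """Eq. (5): summed element-wise cross-entropy between the one-hot
    target p and the output distribution q, both of shape [batch, 10]."""
    q = tf.clip_by_value(q, eps, 1.0 - eps)   # keep the logs finite
    return -tf.reduce_sum(p * tf.math.log(q)
                          + (1.0 - p) * tf.math.log(1.0 - q), axis=-1)
```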

2.2 Injection weight noise strategy for DNN

Considering that a diffractive neural network can be trained using an injected-weight-noise strategy, the network update equation is given by

$$\phi (s + 1) = \phi (s) - \mu \frac{{\partial ({y_t} - output({x_t},{\phi _n}(s)))}}{{\partial \phi }},$$
where $\phi $, the phase value of the diffractive mask, is the learnable parameter; ${\phi _n}$ is the noisy phase, which can be expressed as ${\phi _n}(s) = \phi (s) + noise$; $s$ denotes the $s$-th iteration of the neural network; $({x_t},{y_t})$ is the training set; and $\mu $ represents the learning rate. Since the 3D-printed structure is fixed, this update equation is run on a computer rather than implemented by optical components.
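
A minimal TensorFlow sketch of one such update is shown below: fresh zero-mean Gaussian noise is drawn each iteration and added to the phase before the forward pass, while the gradient step is applied to the clean phase, as in Eq. (6). The `forward` model is assumed to be the layered propagation sketched above, and the Adam optimizer stands in for plain SGD as described for the SRNN in the next paragraph:

```python
import numpy as np
import tensorflow as tf

phi = tf.Variable(tf.random.uniform([200, 200], 0.0, 2.0 * np.pi))  # one layer's phase
opt = tf.keras.optimizers.Adam(learning_rate=1e-2)
NOISE_STD = 0.5                                     # e.g. SRNN(0.5)

def train_step(x_t, y_t, forward):
    # phi_n(s) = phi(s) + noise, redrawn independently every iteration
    noise = tf.random.normal(phi.shape, mean=0.0, stddev=NOISE_STD)
    with tf.GradientTape() as tape:
        output = forward(x_t, phi + noise)          # forward pass with noisy phase
        loss = tf.reduce_mean(
            tf.keras.losses.binary_crossentropy(y_t, output))  # Eq. (5)
    grads = tape.gradient(loss, [phi])              # d loss / d phi
    opt.apply_gradients(zip(grads, [phi]))          # phi(s+1) = phi(s) - mu * grad
    return loss
```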

Figure 3 shows the SRNN backpropagation model, which uses the Adam algorithm, an improved version of stochastic gradient descent with an adaptive learning rate, to train the network. Compared with a conventional DNN, the proposed model adds a real random noise matrix of a certain proportion, drawn from a Gaussian distribution, to each diffractive layer when training the network on the data set. Most noise categories can be approximated as Gaussian, so we inject Gaussian noise into the network weights to increase robustness; other error types could be inserted into the weights, but they are not necessarily as universal as Gaussian noise. The essential reason is that Gaussian noise changes the optimization target of the network weights, which makes the network resilient to disturbances such as $\delta {x_i}$, $\delta {y_i}$, $\delta {z_i}$, and $\delta l$. For each iteration, these noise matrices are generated randomly and independently and added to the network weights, introducing an unavoidable random component into the phase modulation of each layer; this causes an output deviation of the diffractive neural network until the network improves its adaptability to random disturbances. Simulations verify that this training method pushes the model into the vicinity of a minimum value surrounded by a gentle area, which has both pros and cons: it sacrifices a little accuracy under pure weights to make the optical network robust in disturbed environments, an effect that becomes more obvious as the variance of the training noise increases, while the network's weights become insensitive to various errors. In applications, by estimating the actual disturbances in the environment, one can select noise with an appropriate variance to train the network, so as to maximize its robustness while avoiding unnecessary loss of accuracy.

Fig. 3. SRNN backpropagation model.

3. Results and discussion

Figure 4 demonstrates the process of training the DNN and the SRNN separately on MNIST. These neural networks are implemented in Python with the TensorFlow framework (r2.3.0) on a PC with an Intel Core i7-9700H CPU (3.00 GHz), 48 GB of RAM, and an NVIDIA GeForce RTX 2070. The number in the SRNN label represents the standard deviation (Std) of the random noise injected into the model during backpropagation; for example, SRNN (0.5) denotes the strongly robust diffractive network trained with injected Gaussian noise of Std = 0.5. It should be noted that although the standard deviations differ, the mean of all random matrices is zero, so as to ensure the validity of the comparison. According to this training line chart, the larger the standard deviation of the added noise, the slower the network trains and the lower its accuracy. For instance, the blue line (Std = 0.7) requires four epochs to reach 80% accuracy, while the green line (Std = 0.5) requires only two. This is because larger noise injects more severe randomness into the network output at the beginning of training, until the weights of the neural network enter a flat region that is weakly affected by small changes. In a sense, the network has learned how to remove the impact of the injected noise.

Fig. 4. Accuracy characteristics of the DNN and SRNN on MNIST.

To ensure the rigor of subsequent experiments, we train each neural network for 20 epochs with 55,000 training images (plus 5,000 validation images) from the MNIST (Modified National Institute of Standards and Technology) handwritten-digit database. The number of neurons in each layer is $200 \times 200$, the layer spacing is 3 cm, and a 400 GHz light source is used. After training, the optical network classifier is numerically tested on 10,000 images from the MNIST test set, and the results prove that the SRNN handles disturbances better than the DNN; e.g., with 0.5 mm z-axis precision, the SRNN achieves a classification accuracy of over 80% while the DNN achieves only 57.5%. Although a larger number of layers can provide higher expressive ability, the optical neural network has no shortcut connections like a residual network, so the weights of the front layers hardly change, leading to poor training results; therefore, a shallower optical network (2-5 layers) is used in this research. Within this range of layers, whether or not the Weight-Noise-Injection training mode is used, the interplay between the system errors and the number of diffractive layers is tiny, and the classification accuracy increases with the number of network layers.
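
For reference, the simulation constants implied by this setup can be collected as below; the neuron pitch is not stated in the paper, so any pitch value supplied to the propagation sketches of Sec. 2 is an assumption:

```python
C = 3.0e8                  # speed of light, m/s
FREQ = 400e9               # source frequency, Hz (terahertz band)
WAVELENGTH = C / FREQ      # 0.75 mm
LAYER_SPACING = 0.03       # m, i.e. 40 wavelengths between layers
GRID = 200                 # neurons per side of each diffractive layer
EPOCHS = 20
N_TRAIN, N_VALID, N_TEST = 55_000, 5_000, 10_000
```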

Figure 5(a) shows the prediction of the trained SRNN(0.5) for the digit 9. As an approximation within scalar diffraction theory, the Rayleigh-Sommerfeld diffraction formula calculates the light field by treating the lightwave as a scalar wave. At the same time, using Poisson surface reconstruction to process the optimized phase matrix also produces a certain error [8]. We therefore take these factors into account and also show precise light-field distributions calculated using Maxwell's equations. In the future, we will further study how to narrow the gap between the two results.

Fig. 5. Classification of MNIST under different disturbances. (a) Comparing the calculation results of the R-S diffraction equation and Maxwell's equations. (b) Two networks' classification results under different disturbances. (c) Networks under varying levels of frequency shift.

As shown in Fig. 5(b), both the DNN and the SRNN are trained on the same data to transform the spatial information of the digit seven for recognition. Based on the predictions of the two diffractive network models, we compare the classification results under structural error using the pure weights, the weights with added Gaussian noise, and the weights constrained by z-axis precision. Using a $1 \times 10$ vector ${[{{d_0},{d_1},{d_2},\ldots ,{d_9}} ]^T}$ to represent the one-hot-format voting data for the ten categories from digit zero to digit nine, where overly similar grids are marked with a specific proportion, the resistance of both models to the errors mentioned is investigated carefully. Clearly, the SRNN's classification of the input digit seven under pure weights is slightly worse than that of the DNN, its two highest votes, 14.6% for seven and 14.1% for nine, being relatively close. However, the SRNN shows impressive prediction performance under the various disturbance factors, because the SRNN's weights stay in a flat minimum region. Figure 5(c) shows the outputs of the different network models under varying levels of frequency shift. Because frequency shift is an external error distinct from the preceding network structural errors, we probed it independently: the network still defaulted to a 400 GHz frequency, and we neither re-trained networks at other frequencies nor trained a particular network to handle the frequency-shift problem. For this linear network structure, the external error is reflected directly in the $H$ matrix (the scalar diffraction transfer function shown in Fig. 2), which can be seen as being transferred to the weight matrix when the spatial-spectrum method is used to calculate the output light field; this is what allows the SRNN to be robust to frequency shift. Under severe disturbance, the conventional diffractive neural network and the SRNN trained with small-variance noise have almost no ability to deal with frequency error; however, as the robustness of the SRNN is enhanced, the correct option gradually becomes apparent. All in all, our results demonstrate that once the original weights are changed, the conventional optical network's prediction for the input becomes highly unstable, while our proposed model maintains resilient prediction ability under those disturbances.
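
A sketch of this test, assuming the angular-spectrum model from Sec. 2, is shown below: the trained phase stack is left untouched, and only the wavelength inside the transfer function $H$ is recomputed for the shifted frequency. The `forward` helper and its signature are hypothetical:

```python
import numpy as np

def accuracy_under_shift(phases, test_x, test_y, df, forward):
    """Evaluate a network trained at 400 GHz while the propagation is
    recomputed at 400 GHz + df; the weights (phases) stay fixed and the
    frequency shift enters only through H, as described in the text."""
    lam_shifted = 3.0e8 / (400e9 + df)
    preds = forward(test_x, phases, wavelength=lam_shifted)
    return np.mean(np.argmax(preds, axis=-1) == np.argmax(test_y, axis=-1))
```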

In Fig. 6, the histograms show the effect of different errors on the trained DNN and SRNN through simulated prediction results. For the robustness of the networks, we use the prediction accuracy under a certain error tolerance to show the effectiveness of the proposed method. In Fig. 6(a), we inject Gaussian matrices generated with the same random seed into the weights of each network and then calculate the classification accuracy, in order to eliminate the possibility that the situation shown in Fig. 5 is a particular case. For each standard deviation, 120 Gaussian matrices, generated independently and randomly and changed every epoch, are assigned to the five layers of the networks to obtain 24 test-set accuracies; the classification accuracy used for comparison is the average of these 24 results, and the error bars represent the standard deviation of the distribution of network outputs. It can be observed that the SRNN, having been injected with Gaussian random matrices during training, is less affected by the noise; by contrast, the DNN's prediction is on the brink of random classification when the standard deviation of the injected noise reaches 1. In Fig. 6(b), we further analyze the effect of the 3D-printing z-axis precision on the optical networks. The hardware implementation of the optical diffractive neural network relies on 3D printing of the computer-optimized layers, which means that the phase-modulation capability of each layer is severely affected by the z-axis precision of the 3D printer. In the experiments, several z-axis precision grades, labeled 0.1 mm, 0.2 mm, 0.4 mm, and 0.5 mm, are used to test the networks' predictions. It can be seen that SRNN (0.3) has slightly lower classification accuracy than the DNN (lower by only 2%) at ultra-high z-axis precision, which carries expensive production cost and low fault tolerance; however, it shows impressive classification accuracy at lower precision. Although the SRNN is trained with random Gaussian noise, it also shows notable resistance to 3D-printing errors, which do not follow a Gaussian distribution: the prediction accuracy of the DNN at the 0.5 mm grade decreases by 35.4% compared with the 0.1 mm grade, while that of SRNN(0.3) is reduced by only 8.2%. These experimental results show that a low-precision 3D printer can be used to make a lower-cost optical diffractive neural network with high prediction accuracy, exhibiting conspicuous robustness to random noise not limited to 3D-printing errors. Figures 6(c) and 6(d) further illustrate that injected-Gaussian-noise training is a generalizable method, demonstrating the impact of changes in lightwave frequency and layer spacing. It should be emphasized that these networks are trained under the conditions of 3 cm layer spacing and 400 GHz lightwave frequency. In Fig. 6(d), random (2.9, 3.1) means that five values generated randomly in the range from 2.9 to 3.1 cm are set as the layer spacings, and the error bars in Fig. 6(d) come from the standard deviations of the network outputs under the randomly generated layer spacings.
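
As a rough illustration of how a z-axis precision grade constrains a trained layer, the sketch below rounds the printed heights to the printer's grid; the linear height-phase relation $h = \varphi \lambda / (2\pi \Delta n)$ and the index contrast $\Delta n$ are illustrative assumptions, since the paper states only that height and weight are linearly related (see Fig. 7(d)):

```python
import numpy as np

def quantize_to_printer(phase, precision, wavelength, dn=0.7):
    """Round a continuous phase mask to the heights a 3D printer can
    realize along z. Assumes h = phase * wavelength / (2*pi*dn)."""
    height = phase * wavelength / (2.0 * np.pi * dn)      # phase -> height
    height_q = np.round(height / precision) * precision   # z-axis rounding
    return height_q * 2.0 * np.pi * dn / wavelength       # back to phase

# e.g. a 0.5 mm z-axis grade applied to one trained layer at 0.75 mm wavelength:
# phase_q = quantize_to_printer(phase, 0.5e-3, 0.75e-3)
```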

Fig. 6. Accuracy characteristics of the optical classification under various disturbances. (a) Adding Gaussian noise to the trained diffractive network weights. (b) Constraining the weights by the 3D-printer z-axis precision. (c) Changing the lightwave frequency of the optical networks. (d) Changing the layer spacing of the optical networks. (e) Confusion matrix of the networks under 0.4 mm z-axis precision.

Figure 7 illustrates the diffractive neural network as formed by 3D printing, where the thickness of the diffraction area can be selected in the range from 0.5 to 1.0 mm. To prevent deformation of the phase plate during transportation or installation, the thickness of the installation area is set to 2.5 mm. Owing to the finite precision of the 3D-printing process, the mask in (b) is the sum of the ideal optimized mask and the processing error [8]. The weight-noise-injection strategy is similar to the multiplicative-noise strategy of Dropout in that both inject a perturbation into the network structure itself rather than into the original input values. Generally, applying Weight-Noise-Injection to standard networks prompts an ill-conditioned response in which the network weights grow large to offset the fixed-scale additive noise (i.e., to relatively reduce the injected noise). Nevertheless, Fig. 7(d), in which Height (m) has a fixed linear relationship with the network weights, plots the profile along line p of Fig. 7(b) for the distinct network models at layer 3, illustrating the measurement and transformation law of the network weights. Based on Fig. 7(d), we verified that for a network structure without rectified linear units, such as the DNN, Weight-Noise-Injection does not bring about ill-conditioning, which means that weight-noise injection remains completely valid for improving the robustness of this unconventional network structure: it forces the optimal weights toward a minimum region that is relatively insensitive to errors.
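
A simple way to check for the ill-conditioning described above is to track the weight norm across training; this hypothetical monitoring hook is our own illustration, not part of the authors' pipeline:

```python
import tensorflow as tf

# In ReLU networks trained with fixed-scale additive weight noise, ||phi||
# tends to grow (relatively shrinking the noise); for the phase-only DNN,
# no such growth is observed, consistent with Fig. 7(d).
norm_history = []

def log_weight_norm(phi):
    """Record the L2 norm of a layer's weights once per epoch."""
    norm_history.append(float(tf.norm(phi)))
```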

Fig. 7. Typical device structural parameters and diffractive network weight polyline. (a) Typical structure of the optical neural network. (b) The third phase plate of the network. (c) Surface details of the phase plate. (d) Network layer's polyline.

To further investigate the role of 3D printing in causing diffractive-network prediction errors, the variation of the network weights as the z-axis precision is adjusted is illustrated in Fig. 8. The optical neural network shaped by 3D printing is undoubtedly affected by the z-axis precision, an error that differs from Gaussian noise: it resembles a filtering operation that effectively removes high-frequency components. As the precision decreases, the subtle structure of the polyline keeps vanishing, because the 3D printer erases height differences it cannot resolve. As shown in Fig. 8(c), most of the structural information in the optical network has been erased at relatively low printing precision, which means the network needs to keep its effective information from being assimilated as much as possible to achieve better classification. The apparent diffraction pattern is easy to see in the original phase, which shows an unprocessed natural phase. Under low-precision processing, the layout of the DNN layer used to transform the optical spatial information is almost effaced, while the outline of the SRNN phase plate is well preserved, re-verifying that our proposed model is more robust than the conventional optical network model.

Fig. 8. Optical network weights under different 3D printer z-axis precision. (a) Polyline of the DNN under different z-axis precision. (b) Polyline of the SRNN under different z-axis precision. (c) The third phase mask of DNN and SRNN under different z-axis precision (0.2 mm, 0.5 mm).

4. Conclusion

An SRNN model has been proposed for guiding the design and construction of an all-optical IIDA by remarkably mitigating, in a non-hardware way, the influence of typical structural errors and lightwave frequency shifting on the DNN output. By judging the effects of various disturbances on the SRNN and on a conventional optical network, it is verified that the proposed optical network model demonstrates satisfactory robustness in a relatively complex noise environment. The established SRNN model remarkably improves the noise resistance of the optical network given existing 3D-printer parameters and terahertz lightwave characteristics, laying a foundation for constructing the next generation of all-optical intelligent imaging detector arrays.

Funding

National Natural Science Foundation of China (61176052, 61432007, 61821003); China Aerospace Science and Technology Innovation Fund (CASC2015).

Disclosures

The authors declare that there are no conflicts of interest.

References

1. K. He, X. Zhang, and S. Ren, "Deep Residual Learning for Image Recognition," arXiv preprint arXiv:1512.03385 (2015).

2. K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint arXiv:1409.1556 (2015).

3. G. Giacinto and F. Roli, "Design of effective neural network ensembles for image classification processes," Image Vis. Comput. 19(9-10), 699–707 (2001).

4. A. G. Howard, "Some Improvements on Deep Convolutional Neural Network Based Image Classification," Computer Science (2013).

5. Z. Zhang, Z. Wang, and Z. Lin, "Image Super-Resolution by Neural Texture Transfer," in CVPR (2019).

6. T. W. Hughes, M. Minkov, Y. Shi, and S. Fan, "Training of photonic neural networks through in situ backpropagation and gradient measurement," Optica 5(7), 864–871 (2018).

7. R. Hamerly, L. Bernstein, and A. Sludds, "Large-Scale Optical Neural Networks Based on Photoelectric Multiplication," Phys. Rev. X 9(2), 021032 (2019).

8. X. Lin, Y. Rivenson, N. T. Yardimci, M. Veli, and Y. Luo, "All-optical machine learning using diffractive deep neural networks," Science 361(6406), 1004–1008 (2018).

9. L. Vivien, A. Polzer, D. Marris-Morini, J. Osmond, and J. Hartmann, "Zero-bias 40gbit/s germanium waveguide photodetector on silicon," Opt. Express 20(2), 1096–1101 (2012).

10. S. Hochreiter and J. Schmidhuber, "Simplifying neural nets by discovering flat minima," in Advances in Neural Information Processing Systems, 529–536 (1995).

11. C. S. Leung, K. Ho, and J. Sum, "On Weight-Noise-Injection Training," in Advances in Neuro-Information Processing, International Conference (2008).

12. L. Holmstrom and P. Koistinen, "Using additive noise in backpropagation training," IEEE Trans. Neural Netw. 3(1), 24–38 (1992).

13. Y. Zuo, B. Li, Y. Zhao, Y. Jiang, and Y. Chen, "All Optical Neural Network with Nonlinear Activation Functions," arXiv preprint arXiv:1904.10819 (2019).

14. J. Li, D. Mengu, Y. Luo, and Y. Rivenson, "Class-specific differential detection in diffractive optical neural networks improves inference accuracy," Adv. Photonics 1(04), 1 (2019).

15. A. F. Murray and P. J. Edwards, "Synaptic weight noise during multilayer perceptron training: fault tolerance and training improvements," IEEE Trans. Neural Netw. 4(4), 722–725 (1993).

16. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, 2016).

17. T. Yan, J. Wu, T. Zhou, and H. Xie, "Fourier-space diffractive deep neural network," Phys. Rev. Lett. 123(2), 023901 (2019).

18. Y. Luo, D. Mengu, and N. T. Yardimci, "Design of Task-Specific Optical Systems Using Broadband Diffractive Neural Networks," Light: Sci. Appl. 8(1), 112 (2019).

19. D. Mengu, Y. Zhao, and N. T. Yardimci, "Misalignment Resilient Diffractive Optical Networks," arXiv preprint arXiv:2005.11464 (2020).

20. Y. Chen and J. Zhu, "An optical diffractive deep neural network with multiple frequency-channels," arXiv preprint arXiv:1912.10730 (2019).

21. Y. Chen, "Express Wavenet - a low parameter optical neural network with random shift wavelet pattern," arXiv preprint arXiv:2001.01458 (2020).

22. N. Srivastava, G. Hinton, and A. Krizhevsky, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res. 15(1), 1929–1958 (2014).

23. X. Bouthillier, K. Konda, and P. Vincent, "Dropout as data augmentation," arXiv preprint arXiv:1506.08700 (2015).

24. J. W. Goodman, Introduction to Fourier Optics (Roberts and Company, 2005).

25. Q. Tong, Y. Lei, Z. Xin, X. Zhang, and H. Sang, "Dual-mode photosensitive arrays based on the integration of liquid crystal microlenses and CMOS sensors for obtaining the intensity images and wavefronts of objects," Opt. Express 24(3), 1903–1923 (2016).

26. T. Zhou, L. Fang, T. Yan, J. Wu, Y. Li, J. Fan, H. Wu, X. Lin, and Q. Dai, "In situ optical backpropagation training of diffractive optical neural networks," Photonics Res. 8(6), 940–953 (2020).

27. I. U. Idehenre and M. S. Mills, "Multi-directional beam steering using diffractive neural networks," Opt. Express 28(18), 25915–25934 (2020).

28. H. Dou, Y. Deng, T. Yan, H. Wu, X. Lin, and Q. Dai, "Residual D2NN: training diffractive deep neural networks via learnable light shortcuts," Opt. Lett. 45(10), 2688–2691 (2020).

29. C. Bishop, "Training with Noise is Equivalent to Tikhonov Regularization," Neural Comput. 7(1), 108–116 (1995).

30. G. An, "The Effects of Adding Noise During Backpropagation Training on a Generalization Performance," Neural Comput. 8(3), 643–674 (1996).

31. J. Sietsma and R. Dow, "Creating artificial neural networks that generalize," Neural Networks 4(1), 67–79 (1991).

32. P. Vincent, H. Larochelle, and Y. Bengio, "Extracting and composing robust features with denoising autoencoders," in International Conference on Machine Learning, 1096–1103 (2008).

33. C. Szegedy, V. Vanhoucke, and S. Ioffe, "Rethinking the Inception Architecture for Computer Vision," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2818–2826 (2016).


