Photoelectric hybrid neural network based on ZnO nematic liquid crystal microlens array for hyperspectral imaging

Hui Li; Tian Li; Si Chen; Yuntao Wu

doi:10.1364/OE.482498

1. Introduction

The spectrometer has become one of the essential scientific research instruments, widely used in agriculture, astronomy, biology, chemistry, food, and other fields [1–3].

Traditional spectrometers with dispersive optical elements are usually bulky and inconvenient, which makes them unsuitable for miniaturized scenarios. Computational spectral imaging techniques based on pure algorithms have therefore emerged to address miniaturization [4], such as the computed tomography imaging spectrometer (CTIS) [5], the coded aperture snapshot spectral imager (CASSI) [6,7], and tomographic spectral imaging (CTHIS) [8]. However, traditional computational spectral reconstruction methods suffer from an enormous calculation burden, low spectral quality, and low spatial resolution. In recent years, convolutional neural networks (CNNs) for spectral reconstruction have gradually emerged with the development of deep learning technology. Representative networks include HSCNN+, AWAN, and MST++ [9–11]. These networks are also pure algorithms rather than photoelectric hybrid neural networks; they have achieved spectral image reconstruction at 512 × 482 pixels with a 10 nm spectral interval in the 400-700 nm range.

With the growing number of novel optical elements, spectral reconstruction based on software and hardware collaboration has entered a new stage. Nanjing University recently reconstructed spectral images in the visible range using an achromatic metalens array and a CNN [12]. Zhejiang University used deep learning to design a filter with 16 random bands combined with a CNN, realizing 1 nm spectral reconstruction over the 400-700 nm range [13]. These results show that spectral images can be reconstructed using novel optical elements combined with deep learning. In addition, owing to the excellent photoelectric characteristics of liquid crystal (LC), the LC microlens array (MLA), an electrically controlled tunable-focus element, is also a novel optical imaging device that has been successfully applied in several optical imaging applications [14–16]. Based on Levoy’s light field theory, the LC-MLA can be applied to acquire light field information of the scene [17], and it provides an extended depth of field (DOF) compared to the conventional glass-type optical MLA [18]. The plenoptic function acquired by light field theory is $L(x,y,u,v,\lambda ,t)$ [19]. The spectral information can be decoupled from the plenoptic function. Therefore, hyperspectral images can in theory be acquired by an LC-MLA via deep learning. In addition, zinc oxide (ZnO) is an inorganic material with high transparency and good conductivity in the visible wavelength range compared to other oxide materials. With these advantages, it has been widely used in LC devices, solar energy, and other fields [20–24]. To an extent, it could replace indium tin oxide (ITO) in fabricating LC devices.

In this study, the designed optoelectronic hybrid network based on a ZnO LC-MLA achieves a resolution of 1536 × 1536 pixels and a 1 nm spectral interval over the 400-700 nm wavelength range. The key element of the hybrid network is the ZnO LC-MLA, which acts as a convolution kernel to reduce the network volume and improve the network speed. The proposed hybrid network provides resolution enhancement for hyperspectral imaging. The reconstruction results reach high luminous flux with high imaging SNR. The hybrid network is stable and reliable, free of spectral mismatch and geometric distortion, and requires no scanning elements.

2. Theory and principle

2.1 Spectral reconstruction theory via convex optimization

The light field acquisition mainly adopts the biplane model with MLA. Each point on the detector image plane receives the integral of all rays from the whole pupil [25].

With the ZnO LC-MLA, the obtained light field has an extended DOF compared to the conventional glass type. Its major advantage is an electrically tunable focal length. To make full use of the ZnO LC-MLA, light field information was acquired at three randomly selected voltages, i.e., three electrically controlled focal states, to construct all-in-focus information. The objective function of the ZnO LC-MLA is,

(1)$$\widetilde l = \mathop {\arg \min }\limits_l {E_d}(l) + \gamma {E_m}(l).$$

where, ${E_d}(l)$ is the data term, given by

(2)$$\begin{aligned} {E_d}(l) = ||\int\!\!\!\int {{h_{\sigma V_1^\ast }}l(x,y,\lambda )dxdy - {f_{V_1^\ast }}(x,y,\lambda )} ||_2^2 + \\ ||\int\!\!\!\int {{h_{\sigma V_2^\ast }}l(x,y,\lambda )dxdy - {f_{V_2^\ast }}(x,y,\lambda )} ||_2^2 + \\ ||\int\!\!\!\int {{h_{\sigma V_3^\ast }}l(x,y,\lambda )dxdy - {f_{V_3^\ast }}(x,y,\lambda )} ||_2^2. \end{aligned}$$

where, ${h_{\sigma V_i^\ast }}$ is the Gaussian convolution kernel acting on $l(x,y,\lambda )$, which yields the locally blurred image. The blur scale is $\sigma = k \cdot b$. The relationship between the blur radius and the ZnO LC-MLA is $b \propto {f_{LC - MLA}}({V_i})$. The random voltage can be expressed as $V_i^\ast = \mathrm{rand}({V_i})$, $i \in \{ 1,2, \ldots ,N\}$.

${E_m}(l)$ is a regularization term, and $\gamma$ is the coefficient of the regularization term. Then,

(3)$${E_m}(l) = ||\nabla l(x,y,\lambda )||_2^2.$$

where, $\nabla$ represents the gradient calculation operator.
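As a rough illustration of how Eqs. (1)-(3) fit together, the following sketch evaluates the objective on a one-dimensional signal. It is a simplified discrete analogue, not the authors' implementation: the test signal, the three blur scales, the kernel radius, and the value of $\gamma$ are all hypothetical, and the wavelength dimension is dropped.

```python
import math

def gaussian_kernel(sigma, radius=3):
    """Normalized 1-D Gaussian kernel h_sigma (weights sum to 1)."""
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    s = sum(w)
    return [v / s for v in w]

def blur(signal, kernel):
    """Convolve with edge replication so the output keeps the input length."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)   # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def data_term(l, observations, sigmas):
    """E_d: sum over the three voltages of ||h_sigma * l - f_V||_2^2 (Eq. (2))."""
    total = 0.0
    for f_obs, sigma in zip(observations, sigmas):
        pred = blur(l, gaussian_kernel(sigma))
        total += sum((p - f) ** 2 for p, f in zip(pred, f_obs))
    return total

def smoothness_term(l):
    """E_m: squared L2 norm of the forward-difference gradient (Eq. (3))."""
    return sum((b - a) ** 2 for a, b in zip(l, l[1:]))

def objective(l, observations, sigmas, gamma):
    """E_d(l) + gamma * E_m(l), the quantity minimized in Eq. (1)."""
    return data_term(l, observations, sigmas) + gamma * smoothness_term(l)

# Hypothetical sharp signal and three blur scales, one per random voltage.
truth = [0.0] * 8 + [1.0] * 8
sigmas = [0.8, 1.5, 2.5]
observations = [blur(truth, gaussian_kernel(s)) for s in sigmas]

flat_guess = [0.5] * 16
e_truth = objective(truth, observations, sigmas, gamma=0.01)
e_flat = objective(flat_guess, observations, sigmas, gamma=0.01)
```

In this toy case the sharp signal explains all three blurred observations exactly, so its cost reduces to the small smoothness penalty, while a flat guess pays a large data-term cost.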

Then, the spectral image at each wavelength is denoted $f(x,y,\lambda )$. For simplicity, if the radial magnification of the subsequent optical system is 1, the image on the focal plane and the aperture have the same spatial coordinate system. The ZnO LC-MLA generates dispersion. Its offset on the focal plane is,

(4)$$\rho (\lambda ) = \sqrt {\delta _x^2 + \delta _y^2} .$$

where ${\delta _x} = \Delta {n_x}(\lambda ) = \Delta {n_\parallel }(\lambda )$, ${\delta _y} = \Delta {n_y}(\lambda ) = \Delta {n_ \bot }(\lambda )$. The transmittance function of ZnO LC-MLA is [26],

(5)$$t(x,y) = {\sin ^2}2\alpha {\sin ^2}\frac{{d[{n_o}(\lambda ) - {n_e}(\lambda )]}}{\lambda }\pi .$$

where, $\alpha$ is the included angle between the optical axis of the LC and the transmission axis of the polarizer, $d$ is the thickness of the LC layer, ${n_o}(\lambda )$ denotes the refractive index for ordinary light, and ${n_e}(\lambda )$ refers to the refractive index for extraordinary light. Since the transmittance of the ZnO LC-MLA is determined by the voltage, its transmittance function is written as $T(V) = t(x,y)$. After passing through the ZnO LC-MLA, the spectral image $f^{\prime}(x,y,\lambda )$ is $f(x - {\delta _x},y - {\delta _y},\lambda ) \cdot t(x,y).$
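Equation (5) can be evaluated numerically as in the sketch below. The E7 refractive indices at 589 nm and the 20 µm layer thickness are the values reported in the fabrication section; the 45° angle and the half-wave thickness are illustrative assumptions, and the voltage dependence of the effective index is not modeled.

```python
import math

def lc_transmittance(alpha_rad, d_m, n_o, n_e, wavelength_m):
    """Eq. (5): t = sin^2(2*alpha) * sin^2(pi * d * (n_o - n_e) / lambda)."""
    retardation = math.pi * d_m * (n_o - n_e) / wavelength_m
    return math.sin(2.0 * alpha_rad) ** 2 * math.sin(retardation) ** 2

N_O, N_E = 1.5217, 1.7472      # Merck E7 at 589 nm and 20 degrees C (from the text)
WAVELENGTH = 589e-9            # metres
ALPHA = math.radians(45.0)     # assumed angle that maximizes sin^2(2*alpha)

# Transmittance of the 20 um cell at 589 nm.
t_cell = lc_transmittance(ALPHA, 20e-6, N_O, N_E, WAVELENGTH)

# Optical axis parallel to the polarizer axis: nothing is transmitted.
t_parallel = lc_transmittance(0.0, 20e-6, N_O, N_E, WAVELENGTH)

# Hypothetical half-wave thickness d = lambda / (2 * |n_e - n_o|): full transmission.
d_half_wave = WAVELENGTH / (2.0 * (N_E - N_O))
t_half_wave = lc_transmittance(ALPHA, d_half_wave, N_O, N_E, WAVELENGTH)
```

Because both factors are squared sines, the result always lies between 0 and 1, and the sign of $n_o - n_e$ does not matter.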

Assuming that the pixel size of the ZnO LC-MLA is the same as the focal plane pixel size, the spectral image $f^{\prime}(x,y,\lambda )$ falls on the detection plane. Its integral over the 400-700 nm range for a unit-size region is,

(6)$$I = \int\!\!\!\int\!\!\!\int {f^{\prime}(x,y,\lambda )} dxdyd\lambda .$$

Equation (6) can also be written as $I = \int\!\!\!\int {dxdy\int {f^{\prime}(x,y,\lambda )} } d\lambda .$ The expression $\int\!\!\!\int {dxdy\int {f^{\prime}(x,y,\lambda )} } d\lambda$ can be recorded as $g(x,y)$. According to the integral mean value theorem, $I\sim g(n,m)$, where $m$ and $n$ are discrete variables.

With the transmittance function written as $t(x,y) = {e^{j\phi (x,y)}}$, where $\phi (x,y)$ refers to the phase transformation function of the ZnO LC-MLA [27], it follows that $I \propto |{\cal F}\{{f(x - {\delta_x},y - {\delta_y},\lambda ) \cdot t(x,y)} \}{|^2}.$ If the original spectral image $f(x,y,\lambda )$ is sampled without aliasing, its discretized expression is $f(m,n,l)$. At the CCD, the integral $g$ satisfies $g = \phi \cdot f$, where $\phi$ here denotes the conversion matrix. The values obtained at the CCD imaging plane are discrete integrals produced by down-sampling.

To obtain the reconstructed spectral image, the objective function is as follows,

(7)$$\tilde{f} = \mathop {\arg \min }\limits_f \frac{1}{2}||g - \phi \cdot f||_2^2 + \alpha ||W|{|_1} + \beta ||f|{|_{TV}},\textrm{s}\textrm{.t}\textrm{.}\,f \ge 0.$$

where $\alpha$ and $\beta$ are the regularization coefficients, used to control the data constraints. $||\cdot |{|_2}$, $||\cdot |{|_1}$, and $||\cdot |{|_{TV}}$ are the L2 norm, L1 norm, and total variation (TV) norm, respectively.

L1 norm is,

(8)$$||W|{|_1} = \sum\limits_i {||l(x,y,\lambda )|{|_1}} .$$

TV norm is,

(9)$$||f|{|_{TV}} = \sum\limits_i {||{D_i}f|{|_2}} .$$

where, ${D_i}$ represents the first-order differential.

This study uses the alternating optimization algorithm to solve the above TV-L1-L2 problem,

(10)$${f^{n + 1}} \leftarrow {f^n} + {\alpha _n}{S^n}.$$

where, ${S^n}$ is the search direction and ${\alpha _n}$ is the step factor.
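The update in Eq. (10) can be illustrated with a deliberately reduced version of Eq. (7). The sketch below keeps only the L2 data term and the constraint $f \ge 0$, takes the steepest-descent direction $S^n = \phi^T(g - \phi f^n)$, and uses a fixed step factor. These choices, the 3 × 3 conversion matrix, and the test spectrum are assumptions for illustration; the L1 and TV terms and the alternating scheme of the paper are not reproduced.

```python
def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def projected_gradient(phi, g, steps=500, step_size=0.1):
    """Iterate f <- max(f + step_size * S, 0) with S = phi^T (g - phi f)."""
    phi_t = transpose(phi)
    f = [0.0] * len(phi[0])
    for _ in range(steps):
        residual = [gi - pi for gi, pi in zip(g, matvec(phi, f))]
        search = matvec(phi_t, residual)   # S^n: negative gradient of 0.5*||g - phi f||^2
        # Eq. (10) update followed by projection onto the constraint f >= 0.
        f = [max(fi + step_size * si, 0.0) for fi, si in zip(f, search)]
    return f

# Hypothetical conversion matrix and nonnegative spectrum.
phi = [[1.0, 0.5, 0.0],
       [0.0, 1.0, 0.5],
       [0.5, 0.0, 1.0]]
f_true = [0.2, 0.7, 0.1]
g = matvec(phi, f_true)          # simulated detector measurements
f_hat = projected_gradient(phi, g)
```

For this well-conditioned toy matrix the iterate converges to the spectrum that generated the measurements; with an ill-conditioned $\phi$ the regularization terms of Eq. (7) become necessary.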

2.2 Hyperspectral reconstruction architecture

The proposed spectral reconstruction architecture mainly comprised an optical neural network and an electronic neural network, as shown in Fig. 1. The optical neural network contained an objective lens, a polarizer, the ZnO LC-MLA, and a CCD. The electronic neural network, a classic CNN structure, had seven 3 × 3 convolution layers with PReLU as the activation function and a self-attention mechanism module.

Fig. 1. Schematic diagram of the photoelectric hybrid neural network based on ZnO LC-MLA. The optical part of the network was composed of a primary lens, a polarizer, the ZnO LC-MLA, and a CCD. Here, the ZnO LC-MLA was at a single voltage. The electronic part of the network was a CNN. The output of the designed network was 24 × 24 sub-views of 301 bands. The obtained light field data covered 400-700 nm, and the spectral accuracy was 1 nm.

We propose an optoelectronic hybrid neural network architecture with a ZnO LC-MLA for spectral reconstruction. The optical link, image signal, and back-end network were connected seamlessly, and the optimization reduced the computational load of the whole network while keeping high spectral reconstruction accuracy. Compared to the classic electronic neural network, the proposed architecture realizes an “optical” computing function. It does not use a separate dispersion device. Because the ZnO LC-MLA collects the light field, this architecture can work under natural light. Resolution-enhanced hyperspectral information about the target object can be obtained by decoupling the light field data.

The proposed optical system uses the ZnO LC-MLA as a convolution kernel matched to the CCD. The light from the object fell on the sensor through the ZnO LC-MLA. The light intensity value g(n,m) was the integral of the corresponding object point over the 400-700 nm wavelength range. This convolution relationship was the neural network calculation performed by the ZnO LC-MLA, as shown in Fig. 2. The ZnO LC-MLA replaced the first convolution layer of the electronic neural network, and the intermediate result was then used as the input of the remaining layers of the proposed CNN.

Fig. 2. Schematic diagram of ZnO LC-MLA realizing convolution. The ZnO LC-MLA transmittance function T(V) acted as a convolution kernel. After the spectral information of the target object passed through the ZnO LC-MLA, the realized operation was a convolution. The purpose of the optical calculation was to extract the feature information of the target object’s spectrum.

For the proposed architecture, the values of the convolution kernel were mapped from the luminous flux distribution of the ZnO LC-MLA. The size of the convolution kernel, the same as that of a general convolution layer, was 3 × 3. The mapped convolution kernel weights ranged from 0 to 1. Moreover, neighboring convolution kernels had similar weights because of the optical consistency of the ZnO LC-MLA.
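A minimal sketch of this mapping is given below. The text does not state how luminous flux is mapped to weights, so dividing by the patch maximum is an assumption, and the flux values and test image are hypothetical. The sliding-window sum follows the CNN convention of not flipping the kernel.

```python
def normalize_kernel(flux):
    """Map a 3x3 luminous-flux patch to weights in [0, 1] by dividing by its maximum."""
    peak = max(max(row) for row in flux)
    return [[v / peak for v in row] for row in flux]

def conv2d_valid(image, kernel):
    """Sliding-window weighted sum over the 'valid' region (no padding, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(kernel[a][b] * image[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# Hypothetical flux distribution under one microlens.
flux = [[2.0, 4.0, 2.0],
        [4.0, 8.0, 4.0],
        [2.0, 4.0, 2.0]]
kernel = normalize_kernel(flux)

# A uniform 4x4 test image gives a 2x2 feature map.
image = [[1.0] * 4 for _ in range(4)]
features = conv2d_valid(image, kernel)
```

When the input patch has the same 3 × 3 size as the kernel, the output is a single number, i.e., the convolution reduces to a dot product between the patch and the kernel weights.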

The proposed network was an end-to-end architecture, from the ray in the actual scene to the reconstruction of hyperspectral data in the network. A convolution layer and a self-attention mechanism module were the main components of the architecture. The ZnO LC-MLA realized the first convolution layer. The ZnO LC-MLA was the core of our proposed architecture. This architecture removed the first convolution layer of the CNN and transplanted it into the optical domain to optimize and simplify the architecture. In order to maximize the spectral accuracy of the network, we used the self-attention mechanism module to optimize the parameters of the electronic neural network.

The optoelectronic hybrid network included an optical convolution layer and an electronic neural network. The optimized parameters were the convolution kernel weights and bias terms of the electronic neural network. The network was optimized with a mean square error loss function.
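For reference, a mean square error loss is commonly written as follows. The notation is an assumption introduced here for illustration: ${\hat f_i}$ is the network output for the $i$-th training patch, ${f_i}$ is the corresponding reference spectral patch, $N$ is the number of patches in a batch, and $P$ is the number of entries in one patch.

$${\cal L}_{MSE} = \frac{1}{N}\sum\limits_{i = 1}^N {\frac{1}{P}||{{\hat f}_i} - {f_i}||_2^2} .$$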

2.3 ZnO LC-MLA fabrication

Conventional friction alignment requires polyimide (PI) as the alignment film, and this organic polymer film gradually decomposes and loses its alignment properties under strong external light irradiation [28]. This sharply degrades the imaging quality of an LC-MLA fabricated by the friction method. ZnO, in contrast, is a non-polymer material with high stability, which has great potential for LC alignment [20,29]. In this study, we used a ZnO microstructure as the alignment layer to improve the LC-MLA performance.

The specific process of preparing the ZnO LC-MLA is as follows. The ZnO LC-MLA was composed of two glass substrates, an ITO layer and an aluminum layer as electrodes, a ZnO alignment layer, and a nematic LC layer. Figure 3 presents the structure.

Fig. 3. Structural diagram of ZnO LC-MLA. (a) The structural diagram; (b) The top electrode layer adopts an aluminum film with a 128 × 128 array of circles, each 128 µm in diameter; (c) SEM profile image of the ZnO LC-MLA; (d) The microstructure fabricated on the glass layer serves as the alignment layer.

The critical step in preparing the device was the fabrication of the ZnO microstructure. ZnO is an optically transparent semiconductor material with excellent photoelectric properties. A ZnO film can align LC molecules through the surface tension of the interface, providing a new way to anchor LC molecules [30]. Compared with graphene as an alignment film, ZnO film has a high C-axis preferred orientation [31]. In this study, the ZnO microstructure was fabricated by electron beam evaporation: the ZnO film was deposited on the etched grid microstructure. The microstructure has an interval of 100 µm, a length of 20 mm, a height of 300 nm, and a top-line width of 20 µm, as shown in Fig. 3 (d). The pretilt angle of the LC molecules is about 2.7°.

The nematic LC used is Merck E7. At a room temperature of 20 ℃ and a wavelength of 589 nm, the refractive indices of E7 are n_e= 1.7472 and n_o= 1.5217. The nematic LC was filled into the LC cell by capillary action. Other vital parameters of the ZnO LC-MLA were as follows. The driving signal frequency was 1 kHz, the voltage ranged from 0.0V_rms to 10.0V_rms, and the working voltage was applied between the ITO glass substrate and the aluminum layer. The thickness of the LC layer was 20 µm, determined by the spacer balls. The top electrode pattern of ITO was fabricated by UV lithography and wet etching. The diameter was 128 µm. The spacing between array elements was 32 µm. The number of array elements was 128 × 128, as shown in Fig. 3 (b).

3. Experimental section

3.1 Experiment setup

Figure 1 shows the optical path of the proposed photoelectric hybrid network. The architecture consisted of an objective lens, a polarizer, a ZnO LC-MLA, a high-definition CCD, and a desktop computer, as shown in Fig. 4. The CCD was the MV-SUA500C-T from Huateng Weishi Co., with a pixel size of 2.2 µm × 2.2 µm, 5 × 10⁶ pixels, a 1/2.5 CMOS sensor, and 3180 × 3180 pixels. The light emitted from the scene passed through the objective lens, polarizer, and ZnO LC-MLA.

Fig. 4. Experimental setup. (a) Photograph of the optoelectronic hybrid neural network architecture based on ZnO LC-MLA; (b) The filter wheel imaging spectrometer; the lower-right area shows the transmittance of one group of six filters.

The classic filter wheel imaging spectrometer was selected as the ground truth reference. It was composed of an FWO-A1 filter wheel (Oeabt Co.) and a CCD. A total of 24 filters were arranged in four groups of six. Figure 4 (b) shows one group of filters: 450 nm, 490 nm, 530 nm, 570 nm, 610 nm, and 650 nm. The reconstruction algorithm chosen for the filter wheel was linear interpolation. In addition, we selected the following control groups for comparison: the proposed architecture without the ZnO LC-MLA as a convolution layer (LC-MLA without), the conventional glass-type MLA as a convolution layer (MLA with), and the conventional glass type without the MLA as a convolution layer (MLA without). The traditional MLA has the same parameters as the ZnO LC-MLA.

3.2 Deep learning training

The training dataset for the optoelectronic hybrid network was ARAD-HS. The training set contains various scenes, each with 512 × 482 pixels and a total of 301 bands from 400 nm to 700 nm obtained through interpolation. The interval between bands is 1 nm. To facilitate training, we cropped and reorganized the data into training patches of size 64 × 64 × 301. The total number of training patches was 210000. We used PyTorch for development and Adam as the optimizer. For the proposed network, the number of epochs was 300, the batch size was 20, and the learning rate was 10⁻⁴.
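The band count and the interpolation step can be sketched as follows. The text only says the 301 bands were obtained through interpolation; the 10 nm source grid, the piecewise-linear scheme, and the example spectrum below are assumptions for illustration.

```python
def band_centers(start_nm=400, stop_nm=700, step_nm=1):
    """Band centre wavelengths from start to stop, inclusive."""
    return list(range(start_nm, stop_nm + 1, step_nm))

def interpolate_spectrum(coarse_nm, coarse_values, fine_nm):
    """Piecewise-linear interpolation of one pixel's spectrum onto a finer grid."""
    out = []
    k = 0
    for wl in fine_nm:
        # Advance to the coarse interval [coarse_nm[k], coarse_nm[k + 1]] containing wl.
        while k < len(coarse_nm) - 2 and wl > coarse_nm[k + 1]:
            k += 1
        x0, x1 = coarse_nm[k], coarse_nm[k + 1]
        y0, y1 = coarse_values[k], coarse_values[k + 1]
        out.append(y0 + (y1 - y0) * (wl - x0) / (x1 - x0))
    return out

coarse = band_centers(step_nm=10)                 # 31 bands on the assumed 10 nm grid
fine = band_centers(step_nm=1)                    # 301 bands at 1 nm
values = [0.01 * i for i in range(len(coarse))]   # hypothetical linear-ramp spectrum
dense = interpolate_spectrum(coarse, values, fine)

# Number of values in one 64 x 64 x 301 training patch.
patch_values = 64 * 64 * len(fine)
```

With 400 nm and 700 nm both included, a 1 nm step gives 301 bands, which matches the patch depth quoted above.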

The software and hardware environment for network training was as follows. The operating system was Ubuntu 18. The hardware configuration was an Intel Xeon E5-2660v3 CPU (2.6 GHz, 10 cores, 20 threads), 16 GB DDR4 ECC memory, and an NVIDIA RTX 3060 GPU with 12 GB of memory. The alternating optimization algorithm was used. The training time of the model was about 15 hours.

4. Results and discussion

4.1 ZnO microstructure alignment

The conventional alignment technology for LC-MLA is the friction method. The main issues of using PI film were static electricity and dust, which seriously affected the final imaging quality of the LC-MLA. Developing new alignment technology is an effective way to improve the performance of LC-MLA, which is also a research hotspot [32].

The alignment of the LC molecules on the surface of the ZnO microstructure was observed using a polarizing optical microscope (POM). Figure 5 shows the polarizing micrograph of LC molecules on the surface of the ZnO microstructure. The experiments show that the LC molecules align uniformly on the ZnO microstructure. When the applied voltage was 3.5V_rms and the angle between the polarizer and the analyzer was 0°, the LC molecules presented the same color across the field of view. When the angle between the polarizer and the analyzer was 45°, the image in the field of view became significantly darker. When the angle between the polarizer and the CCD was kept unchanged, the image in the field of view gradually changed as the voltage changed, indicating that the distribution of LC molecules also varied. This shows that the ZnO microstructure could effectively induce LC molecules to align along the horizontal direction.

Fig. 5. Experimental results of ZnO LC-MLA alignment. (a) Alignment mechanism; (b) POM results under different angles and applied voltages.

The contact angles of deionized water and LC droplets on the ZnO microstructure, measured with a DataPhysics OCA25, are shown in Fig. 6. For both deionized water and LC droplets, the contact angles were less than 90°. In Fig. 6 (b), the LC was dropped on the surface of the ZnO microstructure; at equilibrium, the contact angle was 43.9°. Figure 6 (c) shows the contact angle of deionized water on the surface of the ZnO microstructure, which was 75.7°.

Fig. 6. Contact angle test. (a) Contact angle explained by three-phase interface principle; (b) Contact angle of LC droplets on ZnO microstructure; (c) Contact angle of deionized water on ZnO microstructure.

The anchoring energy of the proposed ZnO microstructure reaches 1.858 × 10⁻⁴ J/m². Compared with the friction alignment method, the anchoring of the ZnO microstructure is strong. A van der Waals force forms between the ZnO microstructure and the LC molecules. Berreman’s theory explains the alignment mechanism of the conventional friction method in terms of the anchoring of a nematic LC due to elastic distortions induced by a sinusoidally grooved surface. The ZnO microstructure has a periodic grid structure, which Berreman’s theory can also explain. The microstructure also conforms to the Friedel-Creagh-Kmetz (FCK) law [33]. In addition, the surface of the ZnO microstructure is hydrophilic, with a contact angle smaller than 90°, making it easy for long-chain LC molecules to form a horizontal alignment. Through the combination of the above factors, the ZnO microstructure determines the alignment of the LC molecules. It effectively induces the long axis of the LC molecules to align along the horizontal direction, solving the issues caused by the conventional friction alignment method. As a result, the ZnO microstructure could significantly improve the performance of the proposed LC-MLA, which is a critical factor in improving the performance of the proposed photoelectric hybrid neural network.

4.2 ZnO LC-MLA photoelectric features

Figure 7 presents the transmittance data of ZnO LC-MLA from the 400 nm to 700 nm range. The transmittance function is presented in Eq. (5). From the measured result, the fabricated device has good transmittance and a high luminous flux in the visible light range.

Fig. 7. Experimental results of ZnO LC-MLA transmittance. (a) The light transmittance curve of ZnO LC-MLA from 300 nm to 900 nm under the voltage of 0V_rms, 2V_rms, 4V_rms, and 8V_rms; (b) The actual picture of ZnO LC-MLA.

A scene was set 1.5 cm in front of the ZnO LC-MLA, illuminated by an indoor iodine tungsten lamp. The light field data analysis begins with the original raw data. The obtained Bayer images using the proposed ZnO LC-MLA at various voltages are shown in Fig. 8(a)-(c).

Fig. 8. Acquisition results of ZnO LC-MLA at different voltages (the Bayer images have been processed), which are (a) 0V_rms, (b) 4V_rms, and (c) 6V_rms. (d) is the all-in-focus information with the random three-voltage strategy.

The distance between the main lens and the ZnO LC-MLA was 1.1 mm. Owing to the birefringence of the LC, the adjustable focus is the primary advantage of the ZnO LC-MLA. The focal length of the ZnO LC-MLA is $f = \frac{{r_{LC}^2}}{{2\Delta n \cdot {d_{LC}}}}$, where ${r_{LC}}$ represents the radius of the circular electrode on the top substrate, $\Delta n$ is the refractive index difference between the central region and the edge region of the LC layer, and ${d_{LC}}$ is the thickness of the LC layer. The USAF 1951 resolution plate was used for the measurement. With white light as the light source, a spatial resolution of 32.0 lp/mm was selected on the resolution plate. At each applied voltage, the distance between the ZnO LC-MLA and the sensor was adjusted until the image of the resolution plate was clear again. With this subjective judgment method, the focal length of the ZnO LC-MLA at different applied voltages was recorded. The applied voltage was inversely proportional to the focal length. When the applied voltage varied from 0.0V_rms to 10.0V_rms, the focal length ranged from 0.06 mm to 1.8 mm.
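The focal length formula can be evaluated directly, as in the sketch below. The radius is taken as half of the 128 µm diameter and the thickness as the 20 µm LC layer reported in the fabrication section; the $\Delta n$ values are hypothetical, since the voltage-dependent index difference is not tabulated in the text.

```python
def lc_microlens_focal_length(radius_m, delta_n, thickness_m):
    """f = r^2 / (2 * delta_n * d) for a gradient-index LC microlens."""
    if delta_n <= 0:
        raise ValueError("delta_n must be positive; the lens has no focus otherwise")
    return radius_m ** 2 / (2.0 * delta_n * thickness_m)

R_LC = 64e-6   # metres, half of the 128 um diameter
D_LC = 20e-6   # metres, LC layer thickness

# Hypothetical centre-to-edge index differences.
f_small_dn = lc_microlens_focal_length(R_LC, 0.05, D_LC)   # about 2.05 mm
f_large_dn = lc_microlens_focal_length(R_LC, 0.20, D_LC)   # about 0.51 mm
```

A larger index difference between the centre and the edge of the LC layer gives a shorter focal length, which is the mechanism behind the electrically tunable focus.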

At 0.0V_rms, the ZnO LC-MLA was not activated, as shown in Fig. 8(a). When the applied voltage was set to 4.0V_rms, the ZnO LC-MLA was activated and a gradient refractive index array formed in the LC layer. The CCD recorded the light field data with the direction information of the incident light. The imaging result at 4.0V_rms was different from that at 0.0V_rms. The Bayer image obtained at 6.0V_rms also changed correspondingly. The input of the end-to-end network was the all-in-focus information of the ZnO LC-MLA obtained with the random three-voltage strategy via Eq. (1), as shown in Fig. 8(d). The dispersion was obvious in the locally enlarged image and can be expressed by Eq. (4).

4.3 Spectral reconstruction results with the proposed architecture

First, we used the peak intensity detection method to calibrate the ZnO LC-MLA. We photographed a white surface under the irradiation of an iodine tungsten lamp and then detected the two peak responses of the iodine tungsten lamp at 546.5 nm and 611.6 nm. By comparing the peak responses with the spectral data of the reference iodine tungsten lamp and correcting the measured values, we obtained a result consistent with the spectral values of the reference iodine tungsten lamp.
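One simple way to turn two detected peaks into a wavelength correction is a two-point linear map, sketched below. The text does not specify the form of the correction, so the linear model is an assumption, and the sampled spectra and measured peak positions are hypothetical; only the reference peaks at 546.5 nm and 611.6 nm come from the text.

```python
def find_peak(wavelengths_nm, intensities):
    """Wavelength at which the sampled intensity is largest."""
    best = max(range(len(intensities)), key=lambda i: intensities[i])
    return wavelengths_nm[best]

def two_point_calibration(measured_peaks_nm, reference_peaks_nm):
    """Return (gain, offset) so that gain * measured + offset hits both reference peaks."""
    (m1, m2), (r1, r2) = measured_peaks_nm, reference_peaks_nm
    gain = (r2 - r1) / (m2 - m1)
    offset = r1 - gain * m1
    return gain, offset

# Hypothetical uncalibrated samples around each lamp peak.
peak_a = find_peak([543.0, 544.0, 545.0, 546.0, 547.0], [0.2, 0.6, 1.0, 0.7, 0.3])
peak_b = find_peak([611.0, 612.0, 613.0, 614.0, 615.0], [0.1, 0.5, 0.9, 0.4, 0.1])

REFERENCE_PEAKS = (546.5, 611.6)   # nm, from the text
gain, offset = two_point_calibration((peak_a, peak_b), REFERENCE_PEAKS)

def corrected(wavelength_nm):
    """Apply the linear correction to a measured wavelength."""
    return gain * wavelength_nm + offset
```

After the correction, the two measured peaks land on the reference wavelengths, and wavelengths between them are shifted proportionally.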

The incoherent light source was the iodine tungsten lamp, which illuminated the Macbeth standard color card. The light field data were collected using the proposed ZnO LC-MLA. The input of the photoelectric hybrid neural network was the all-in-focus information obtained by Eq. (1). The output contained 301 spectral bands, with a spectral range of 400-700 nm and a central wavelength of 550 nm, obtained via the objective function of Eq. (7) with deep learning. During spectral reconstruction, the all-in-focus information served as a constraint, which can improve the accuracy and robustness of the reconstruction. To test the robustness of the network, 30 dB Gaussian noise was added to the input. Figure 9 shows the results of the experiment. The filter wheel imaging spectrometer served as the ground truth reference. The MLA with group showed significant ringing in the reconstructed spectral image. For the MLA without group, the reconstructed result was still blurred to a certain extent. The LC-MLA without group also showed blurring. Our proposed architecture could reconstruct the spectral curve more clearly than the classic filter wheel imaging spectrometer. Its spectral accuracy was the same as that of the traditional filter wheel imaging spectrometer. As seen in Fig. 9, our proposed architecture was suitable for spectral reconstruction and could obtain resolution-enhanced spectral information.
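The noise test can be mimicked as in the following sketch, which interprets the 30 dB figure as a signal-to-noise ratio. That interpretation, the synthetic input signal, and the random seed are assumptions; the text does not describe how the noise level was defined.

```python
import math
import random

def add_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so that signal power / noise power = 10^(snr_db / 10)."""
    power = sum(v * v for v in signal) / len(signal)
    sigma = math.sqrt(power / (10.0 ** (snr_db / 10.0)))
    return [v + rng.gauss(0.0, sigma) for v in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    p_signal = sum(v * v for v in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)

rng = random.Random(0)                                              # fixed seed for repeatability
clean = [0.5 + 0.4 * math.sin(0.01 * i) for i in range(20000)]      # hypothetical input samples
noisy = add_gaussian_noise(clean, 30.0, rng)
snr = measured_snr_db(clean, noisy)
```

With enough samples the empirical SNR stays close to the requested 30 dB, which corresponds to a noise power one thousandth of the signal power.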

Fig. 9. Macbeth color card comparison results. The control groups were used for comparison. The ground truth used the filter wheel imaging spectrometer as the reference. The PSNR and SSIM relative to the reference image are shown. The presented sub-view of the reconstructed spectral light field was 5 × 5^th. The lower-left area was the selected area of the reconstructed spectral curve. (a) - (f) are the spectral reconstruction curves shown on the lower-right images, respectively.

To quantitatively evaluate the spectral reconstruction, two evaluation metrics, peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), were introduced. The subgraphs in Fig. 9 present the corresponding calculated values.
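For reference, PSNR follows the standard definition $10\log_{10}(\mathrm{peak}^2/\mathrm{MSE})$ and can be computed as sketched below. The tiny images and the peak value of 1.0 are hypothetical; SSIM is not included here because it requires local window statistics.

```python
import math

def psnr(reference, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    squared_errors = [(r - t) ** 2
                      for ref_row, test_row in zip(reference, test)
                      for r, t in zip(ref_row, test_row)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0.0:
        return math.inf          # identical images
    return 10.0 * math.log10(peak * peak / mse)

# Hypothetical reference band and a reconstruction that is off by 0.1 everywhere.
reference = [[0.0, 0.5],
             [1.0, 0.5]]
reconstruction = [[v + 0.1 for v in row] for row in reference]

psnr_db = psnr(reference, reconstruction)     # MSE = 0.01 gives about 20 dB
psnr_same = psnr(reference, reference)
```

A uniform error of 0.1 on a unit-peak image gives a mean squared error of 0.01 and therefore a PSNR of about 20 dB; smaller errors give higher values.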

To further verify the advantages of our proposed architecture, we used an iodine tungsten lamp as an incoherent light source to illuminate the selected scene. The target objects in the actual scene were a teacup and a calendar. Figure 10 presents the PSNR of the reconstruction results in each spectral segment. The ground truth reference was obtained by the filter wheel imaging spectrometer. The MLA without group showed noticeable crosstalk between spectral bands, and its algorithm ran for a relatively long time, so its efficiency was low. For the MLA with group, the reconstructed image was still blurred and its image quality was significantly degraded. The enlarged images in Fig. 10(c) show that introducing the TV norm can effectively overcome the ill-conditioning in spectral image reconstruction, preserve image edges better, and avoid over-smoothing.

Fig. 10. Actual scene comparison experiment results. The control groups were compared with our proposed architecture. The ground truth used the filter wheel imaging spectrometer as the reference. It shows the calculated PSNR and SSIM values relative to the reference image. The result was the 5 × 5^th sub-view of the reconstruction. (a) is the RGB schematic image of the actual scene; (b) For the spectral curve of the corresponding point of the green box, the reconstruction comparison was obtained by the filter wheel imaging spectrometer; (c) is a partially enlarged image of the comparison between the control groups and our proposed architecture.

The control groups, LC-MLA without, MLA with, and MLA without, all produced reconstruction images that were blurred to a certain extent. In contrast, the proposed architecture showed a certain improvement in spatial resolution. In addition, the proposed architecture was simple, with no dispersive optical devices, and its volume was significantly reduced. The use of the ZnO LC-MLA could solve the dimensional mismatch caused by direct measurement. It could also address the insufficient SNR of a single spectral channel in the traditional imaging spectrometer and the long acquisition time for the full data. Compared with current mainstream deep learning networks, our photoelectric hybrid neural network gave full play to the advantages of the ZnO LC-MLA, obtained the all-in-focus information, and realized the multiplexing of the light field information. With the proposed photoelectric hybrid network, the spectral reconstruction accuracy could reach 1 nm, and the resolution could reach 1536 × 1536 pixels. The reconstruction speed also improved because one conventional convolution layer was removed. As seen from Fig. 10(b), the spectral curve reconstructed by our proposed architecture was consistent with that reconstructed by the filter wheel imaging spectrometer. As seen from the locally enlarged image in Fig. 10(c), our proposed architecture could reconstruct the spectral image with clear texture, precise edge contours, and no apparent ringing effect compared with the other methods.

4.4 Discussion

In the proposed architecture, the ZnO LC-MLA produced a phase delay to modulate the incident rays, and the CCD collected the output rays. Each electrically tunable microlens had a focus. When the rays fell on the CCD, the integral information on the CCD was only part of the information of the incident rays. When the conditions were satisfied, the convolution operation reduced to a dot product operation, as shown in Fig. 2. This operation was consistent with the physical process by which the CCD obtains the light intensity value. Therefore, obtaining the light intensity value on the CCD was an “optical” dot product.

Because the proposed architecture used the random three-voltage strategy to generate all-in-focus information, the spatial resolution of the final result was increased by about ten times compared with the existing light field camera [34].

Table 1 presents the comparison of the methods in the control groups. The proposed architecture effectively reconstructed hyperspectral data in the wavelength range of [400 nm, 700 nm]. Its spectral accuracy reaches 1 nm. This architecture has the advantages of simple equipment, low cost, and no need for additional dispersion and filtering devices.

Table 1. Comparison results of control groups^a

In particular, the convolution layer was realized according to the characteristics of the ZnO LC-MLA to improve the performance of the deep learning network, which is quite different from deep learning networks in the field of computer science. It can simplify the network’s structure, improve computational efficiency, and obtain high-quality reconstruction. According to the table, the reconstruction is much faster than pure algorithms and even the MLA type. This means that adopting the LC-MLA as a convolution layer is a significant factor in improving the performance of the optoelectronic hybrid network.

Figure 11 shows the training and testing loss curves of the whole photoelectric hybrid neural network over the training duration. As the number of iterations increased, the training and testing losses continued to decrease slowly. Training and testing began to converge after 15 iterations. The dashed line shows the performance of the classic CNN (MST++). With the ZnO LC-MLA as a convolution layer, the proposed architecture converged faster during training and testing than the pure algorithm. The ZnO LC-MLA as a convolution layer could therefore effectively enhance the performance of the proposed architecture.

Fig. 11. Convergence comparison diagrams of the loss function. The solid line is our proposed method, and the dashed line is the classic CNN (MST++). (a) The convergence diagram of the loss function during training; (b) the convergence diagram of the loss function during testing.


5. Conclusions

This study proposed an optoelectronic hybrid neural network architecture and a prototype. This architecture provided spectral reconstruction with enhanced resolution. The main advantage of the proposed architecture is the use of the ZnO LC-MLA as a convolution layer. The ZnO microstructure served as an alignment layer, which can effectively improve the performance of the ZnO LC-MLA. Experimental verification showed that the volume of the network decreased and its speed increased. By constructing the TV-L1-L2 objective function and using the mean square error as the loss function, the spatial resolution reaches 1536 × 1536 pixels and the reconstructed spectral resolution is 1 nm. The photoelectric hybrid neural network made full use of the ZnO LC-MLA and achieved good reconstruction. The actual architecture is easy to build, and its correction, both spectral and geometric, is simple. However, this architecture also has some limitations. The reconstructed spectral range is still limited to the visible region, and spectral information in the ultraviolet fluorescence and infrared regions is not reconstructed. There is still room to improve the proposed architecture in the future.

Funding

National Natural Science Foundation of China (51703071, 61771353); Natural Science Foundation of Hubei Province (2019CFB553); Knowledge Innovation Program of Wuhan-Basic Research (2022010801010350); New Generation Information Technology Innovation Project of China Ministry of Education (2020ITA05049); Hubei Provincial Key Laboratory of Intelligent Robot (HBIRL202101, HBIRL202203); The Engineering Research Center of Digital Imaging and Display, Ministry of Education, Soochow University (SDGC2134); Wuhan Institute of Technology (CX2022347).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J. Malinen, A. Rissanen, H. Saari, P. Karioja, M. Karppinen, T. Aalto, and K. Tukkiniemi, “Advances in miniature spectrometer and sensor development,” Proc. SPIE 9101, 91010C (2014).

2. P. Gatkine, S. Veilleux, Y. W. Hu, J. Bland-Hawthorn, and M. Dagenais, “Arrayed waveguide grating spectrometers for astronomical applications: new results,” Opt. Express 25(15), 17918–17935 (2017).

3. R. M. Levenson and J. R. Mansfield, “Hyperspectral imaging in biology and medicine: slices of life,” Cytom. Part A 69(8), 748–758 (2006).

4. R. M. Sullenberger, A. B. Milstein, Y. Rachlin, S. Kaushik, and C. M. Wynn, “Computational reconfigurable imaging spectrometer,” Opt. Express 25(25), 31960–31969 (2017).

5. M. Descour and E. Dereniak, “Computed-tomography imaging spectrometer: experimental calibration and reconstruction results,” Appl. Opt. 34(22), 4817–4826 (1995).

6. A. Wagadarikar, R. John, R. Willett, and D. Brady, “Single disperser design for coded aperture snapshot spectral imaging,” Appl. Opt. 47(10), B44–B51 (2008).

7. D. J. Brady and M. E. Gehm, “Compressive imaging spectrometers using coded apertures,” Proc. SPIE 6246, 62460A (2006).

8. J. M. Mooney, V. E. Vickers, M. An, and A. K. Brodzik, “High-throughput hyperspectral infrared camera,” J. Opt. Soc. Am. A 14(11), 2951–2961 (1997).

9. Z. Shi, C. Chen, Z. W. Xiong, D. Liu, and F. Wu, “HSCNN+: advanced CNN-based hyperspectral recovery from RGB images,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2018), pp. 10520–10528.

10. J. J. Li, C. X. Wu, R. Song, Y. S. Li, and F. Liu, “Adaptive Weighted Attention Network with Camera Spectral Sensitivity Prior for Spectral Reconstruction from RGB Images,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2020), pp. 1894–1903.

11. Y. H. Cai, J. Lin, Z. D. Lin, H. Q. Wang, Y. L. Zhang, H. Pfister, R. Timofte, and L. V. Gool, “MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2022), pp. 744–754.

12. X. Hua, Y. J. Wang, S. M. Wang, X. J. Zou, Y. Zhou, L. Li, F. Yan, X. Cao, S. M. Xiao, D. P. Tsai, J. C. Han, Z. L. Wang, and S. N. Zhu, “Ultra-compact snapshot spectral light-field imaging,” Nat. Commun. 13(1), 2732 (2022).

13. W. Y. Zhang, H. Y. Song, X. He, L. Q. Huang, X. Y. Zhang, J. Y. Zheng, W. D. Shen, X. Hao, and X. Liu, “Deeply learned broadband encoding stochastic hyperspectral imaging,” Light: Sci. Appl. 10(1), 108 (2021).

14. L.-L. Tian, F. Chu, W.-X. Zhao, L. Li, and Q.-H. Wang, “Fast responsive 2D/3D switchable display using a liquid crystal microlens array,” Opt. Lett. 46(23), 5870–5873 (2021).

15. S. W. Kang and X. Y. Zhang, “Compound liquid crystal microlens array with convergent and divergent functions,” Appl. Opt. 55(12), 3333–3338 (2016).

16. S. Xu, Y. Li, Y. F. Liu, J. Sun, H. W. Ren, and S. T. Wu, “Fast-response liquid crystal microlens,” Micromachines 5(2), 300–324 (2014).

17. M. Levoy, “Light fields and computational imaging,” Computer 39(8), 46–55 (2006).

18. Y. Lei, Q. Tong, X. Y. Zhang, H. S. Sang, A. Ji, and C. S. Xie, “An electrically tunable plenoptic camera using a liquid crystal microlens array,” Rev. Sci. Instrum. 86(5), 053101 (2015).

19. M. Landy and J. A. Movshon, The plenoptic function and the elements of early vision (MIT Press, 1991).

20. Y. C. He, H. Li, W. T. Qian, and Y. T. Wu, “High-resolution light field imaging based on liquid crystal microlens arrays with ZnO microstructure orientation,” Opt. Lasers Eng. 162, 107424 (2023).

21. C. L. Li, Z. G. Zang, C. Han, Z. P. Hu, X. S. Tang, J. Du, Y. X. Leng, and K. Sun, “Highly compact CsPbBr₃ perovskite thin films decorated by ZnO nanoparticles for enhanced random lasing,” Nano Energy 40, 195–202 (2017).

22. H. X. Wang, S. L. Cao, B. Yang, H. Y. Li, M. Wang, X. F. Hu, K. Sun, and Z. G. Zang, “NH₄Cl-modified ZnO for high-performance CsPbIBr₂ perovskite solar cells via low-temperature process,” Sol. RRL 4(1), 1900363 (2019).

23. H. X. Wang, P. F. Zhang, and Z. G. Zang, “High performance CsPbBr₃ quantum dots photodetectors by using zinc oxide nanorods arrays as an electron-transport layer,” Appl. Phys. Lett. 116(16), 162103 (2020).

24. Z. G. Zang, “Efficiency enhancement of ZnO/Cu₂O solar cells with well oriented and micrometer grain sized Cu₂O films,” Appl. Phys. Lett. 112(4), 042106 (2018).

25. M. Levoy and P. Hanrahan, “Light field rendering,” in 23rd Annual Conference on Computer Graphics and Interactive Techniques (1996), pp. 31–42.

26. C. Huang, X. P. Zhou, Y. D. Ouyang, X. S. Lin, Y. J. Wu, and Y. Huang, “Study on photoelectric dispersion characteristic of liquid crystal light valve,” Spectrosc. Spect. Anal. 26(3), 539–541 (2006).

27. R. Rajasekharan, C. Bay, J. Freeman, and T. D. Wilkinson, “Analysis of an array of micro lenses using Fourier-transform method,” IET Optoelectron. 4(5), 210–215 (2010).

28. D. Y. Zhao, W. Huang, H. Cao, Y. D. Zheng, G. J. Wang, Z. Yang, and H. Yang, “Homeotropic alignment of nematic liquid crystals by a photo cross-linkable organic monomer containing dual photo functional groups,” J. Phys. Chem. B 113(10), 2961–2965 (2009).

29. C.-C. Hsu, Y.-X. Chen, H.-W. Li, and J. Hsu, “Low switching voltage ZnO quantum dots doped polymer-dispersed liquid crystal film,” Opt. Express 24(7), 7063–7068 (2016).

30. Y.-F. Chung, M.-Z. Chen, S.-H. Yang, and S.-C. Jeng, “Tunable surface wettability of ZnO nanoparticle arrays for controlling the alignment of liquid crystals,” ACS Appl. Mater. Interfaces 7(18), 9619–9624 (2015).

31. Y. Kajikawa, “Texture development of non-epitaxial polycrystalline ZnO films,” J. Cryst. Growth 289(1), 387–394 (2006).

32. M.-Z. Chen, W.-S. Chen, S.-C. Jeng, S.-H. Yang, and Y.-F. Chung, “Liquid crystal alignment on zinc oxide nanowire arrays for LCDs applications,” Opt. Express 21(24), 29277–29282 (2013).

33. L. T. Creagh and A. R. Kmetz, “Mechanism of Surface Alignment in Nematic Liquid Crystals,” Mol. Cryst. Liq. Cryst. (1969-1991) 24(1-2), 59–68 (1973).

34. R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, “Light field photography with a hand-held plenoptic camera,” Stanford University Computer Science Tech Report CSTR 2005-02 (2005).

Method	With/without	Pixel number	Output data volume	Reconstruction time
MLA	Without	512 × 512	∼10⁸	38.5 s
MLA	With	512 × 512	∼10⁸	32.3 s
LC-MLA	Without	1536 × 1536	∼10⁹	21.4 s
LC-MLA	With (Ours)	1536 × 1536	∼10⁹	16.5 s
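
The figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below copies only the pixel numbers and reconstruction times from the table. It assumes, as an interpretation that the paper does not state, that the output data volume is the pixel count multiplied by the number of spectral channels, and that 400–700 nm sampled every 1 nm gives 301 channels. The dictionary layout and the helper name `data_volume` are hypothetical.

```python
import math

# Pixel numbers and reconstruction times copied from Table 1.
table = {
    ("MLA", "without"): {"pixels": (512, 512), "time_s": 38.5},
    ("MLA", "with"): {"pixels": (512, 512), "time_s": 32.3},
    ("LC-MLA", "without"): {"pixels": (1536, 1536), "time_s": 21.4},
    ("LC-MLA", "with"): {"pixels": (1536, 1536), "time_s": 16.5},
}

# Assumption: 400-700 nm sampled every 1 nm, both endpoints included.
n_bands = (700 - 400) // 1 + 1

def data_volume(pixels, bands=n_bands):
    """Assumed data volume: spatial samples times spectral channels."""
    return pixels[0] * pixels[1] * bands

ours = table[("LC-MLA", "with")]

# Ratio of pixel counts between the LC-MLA and MLA rows.
pixel_ratio = (1536 * 1536) / (512 * 512)

# How many times longer each row takes than the proposed configuration.
slowdown = {key: row["time_s"] / ours["time_s"] for key, row in table.items()}

for key, row in table.items():
    volume = data_volume(row["pixels"])
    print(key, f"volume ~ 10^{round(math.log10(volume))}",
          f"time ratio = {slowdown[key]:.2f}")
print("pixel ratio:", pixel_ratio)
```

Under these assumptions the pixel ratio is 9, which is in line with the roughly tenfold gain in spatial resolution mentioned in the discussion, and the computed volumes round to the orders of magnitude listed in the table.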

Optics Express