Sequential RGB color imaging with a millimeter-scale monochrome camera with a rolling shutter

Open Access

Abstract

Applications are growing for ultracompact millimeter-scale cameras. For color images, these sensors commonly utilize a Bayer mask, which can perceptibly degrade image resolution and quality, especially for low pixel-count submillimeter sensors. To alleviate this, we built a time-multiplexed RGB LED illumination system synchronized to the rolling shutter of a monochrome camera. The sequential images are processed and displayed as near real-time color video. Experimental comparison with an identical sensor with a Bayer color mask showed significant improvement in the MTF curves and in perceived image clarity. Trade-offs with respect to system complexity and color motion artifacts are discussed.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. INTRODUCTION

The most common method for obtaining color images with a digital camera employs a Bayer color mask, which reduces the effective resolution relative to a comparable monochrome camera without the Bayer mask. When using compact cameras with low pixel counts, this reduction of resolution may be significant.

Ultracompact cameras are increasingly used in applications with strict size constraints, such as video endoscopes, where the camera is packaged at the distal tip of the instrument [1]. Other applications include capsule endoscopy [2], surgical robotics, AR/VR headsets [3], and wearables [4]. Our interest is the use of a miniature camera embedded in a larger optical system to provide an adjunct image. A millimeter-scale camera may be introduced in or near to the beam path of the larger instrument without significant obscuration of the primary optical beam. Using an embedded camera with its own lens gives the designer flexibility to choose the aperture and focal length for the host instrument and for the embedded camera independently to achieve potentially quite different imaging goals.

An example of this type of application is a high-NA dermatology microscope [5]. This device uses scanning laser reflection confocal microscopy (RCM) to examine a patient’s skin and helps to diagnose skin cancer without requiring a biopsy. A challenge for use in vivo is the small field of view (diameter on the order of 500 µm) and the large contact area of the objective lens with the skin (diameter on the order of 10 mm). During use, the objective lens obscures the clinician’s view of the lesion under examination. It is difficult to locate where microscopic imaging is occurring within the context of the lesion. An adjunct camera can provide a wide-field image of the skin, in order to guide the microscopy procedure. The concept is shown in Fig. 1, where the embedded wide-field (WF) camera is approximately 1 mm wide and 1.43 mm tall. This mm-scale camera introduces minimal obscuration of the high-NA microscopy beam, yet it delivers a large field-of-view image at the skin surface.

Fig. 1. Cross section of dermatology microscope tip with annular beam for RCM in blue and the field of view for the wide-field imaging in orange. In (a) the miniature camera [6] shows the skin surface with large field of view of several mm, while the arrow shows the region being imaged by RCM; (b) RCM image. Reprinted with permission from [5].

The pixel count for some mm-scale cameras is quite low, limiting the image quality. Our experiments used the NanEye2D camera [6], with ${{250}} \times {{250}}$ pixels. Performing color imaging with a Bayer color mask compromises this low resolution even further. This motivated us to compare the performance of the color version of this camera, which has a Bayer mask, with the monochrome version of the same camera using three-color sequential illumination to generate a color image.

Several techniques have been explored to create color images with a gray-scale sensor, such as a spinning color filter wheel for sequential RGB filtering [7], tunable filters [8,9], or a tricolor prism with three separate sensors for the RGB fields. Sequential lighting is also an option [10,11] and has been used in other cases such as spectral imaging in biological sensing applications [12]. As an example, BioVision Technologies has produced a commercial device that can create color images using sequential RGB illumination in combination with a monochrome sensor mounted on a microscope but at much larger scale than our application [13]. Systems using sequential illumination with small LEDs or optical fiber to deliver the light can readily be miniaturized.

This paper reports the implementation of a sequential RGB illumination system with a millimeter-scale monochrome image sensor. The phrase “sequential RGB” describes an illumination system that is synchronized with the camera and shines only one color at a time (red, green, or blue) using RGB light-emitting diodes (LEDs). Because the camera is monochrome, every pixel is used for each color with no interpolation. Three sequential images are assigned to their respective color channels and merged to create a full color image. This method increases the spatial resolution at the cost of reducing the full-color frame rate to one-third of the camera frame rate. To minimize the loss of perceived frame rate, the color video is updated with each captured frame instead of waiting for three full frames to be taken and reconstructed [14]. Another consideration for our application is that, compared with other techniques used to create color images using a monochrome sensor, the sequential RGB illumination method allows for a more compact design.

Several super-resolution techniques exist that could be used in tandem with our color reconstruction algorithm to further enhance the color video, though real-time implementation may be a limitation for the current technology [15–18]. It is interesting that sequential illumination has also been used to increase temporal resolution at the cost of spatial and spectral information by dedicating the RGB Bayer channels to interleaved colored illumination pulses, as done in [19].

Our solution to improve spatial resolution using real-time sequential RGB illumination for color video with a miniature commercial camera-on-chip is unique due to the rolling shutter and the color “mixing,” which is allowed to occur so as to maximize frame rate and exposure time simultaneously. We employ a field programmable gate array (FPGA) to synchronize the illumination with the camera’s frame rate. A real-time algorithm accommodates the rolling shutter of the camera to separate the mixed-color exposures into three single-color images for real-time RGB color display. Our experiments measure the modulation transfer function (MTF) using Bayer and sequential RGB color schemes; we compare the results to theoretical expectations accounting for several aspects of the imaging system, including finite pixel size, diffraction-limited lenses, and Bayer interpolation. We discuss the observed improvement in resolution possible with the sequential color scheme as well as some of the consequences of implementing this scheme. As far as we know, this paper is the first to describe a sequential illumination scheme for color imaging using a monochrome camera with a rolling shutter, in which the low-voltage differential signaling (LVDS) serial data link is monitored to synchronize the illumination to the free-running camera. We include as supplemental materials our MATLAB camera interface code and FPGA code so that others can experiment using their own LVDS-based cameras (Code 1, Ref. [20]).

2. MATERIALS AND METHODS

A. Components

The camera we evaluated is the NanEye 2D from ams AG, which includes NanEye Viewer software for displaying video and a USB interface called NanoUSB2. This camera has a square format with total physical dimensions of ${{1}}\;{\rm{mm}} \times {{1}}\;{\rm{mm}} \times {1.43}\;{\rm{mm}}$, with an attached four-wire ribbon cable for power, ground, and two LVDS lines. The active area is ${0.75}\;{\rm{mm}} \times {0.75}\;{\rm{mm}}$ with a resolution of ${{250}} \times {{250}}$ and 3 µm square pixels. The monochrome and Bayer color cameras we evaluated had identical lenses with $f\#$ 4.0, 120° field of view (diagonal, in air), and 0.50 mm focal length. The NanEye Viewer software displays the video feed in a simple graphical user interface and has various capture options. The interface code is available in MATLAB, C#, and C++; we used the MATLAB code and modified it for this specific use. The frame rate varies from 43 to 62 fps, depending largely on the supply voltage to the camera. For these experiments, 2.1 V was supplied to the camera, which corresponds to a frame rate of approximately 56 fps.

In addition to the NanoUSB2 interface, which came with the camera and deserializes the image data for display, an FPGA development board was added to synchronize and power the RGB LEDs. The Intel Cyclone 10 LP evaluation board was chosen to monitor the LVDS signal from the camera to detect the start of each frame, control the illumination system, and signal to the host computer which LEDs are lit. The LEDs are ultracompact Kingbright APGF0606VGTPBTSEETC devices, measuring only 0.65 mm on each side, with RGB peak wavelengths of 632, 518, and 462 nm, respectively. Each color is switched on/off via a simple transistor driver circuit. Figure 2 outlines the block diagram for the complete system.

Fig. 2. System block diagram.

B. MTF Measurement Method

The MTF is widely used as a metric to assess camera performance relative to its ability to resolve the fine details necessary for feature recognition. The MTF plots the modulation depth, or contrast, in the image as a function of the spatial frequency in cycles/pixel of a sinusoidal intensity pattern. The MTF is limited by the performance of the lens due to optical diffraction effects and depends on the $f$-number and aberrations such as defocus. It is also limited by pixel size. The MTF is further degraded by use of a Bayer color mask and the process of interpolation to fill in the missing colors. It is this third effect that we are addressing with our sequential illumination scheme. The overall MTF is the product of the contributions from diffraction, finite pixel size, and interpolation.

Here, we discuss the impact of using a Bayer color separation scheme on the camera’s MTF. The layout of a Bayer mask is shown in Fig. 3 and demonstrates how each pixel is only exposed to light from a filtered band of the visible spectrum. Then, using post-processing, the final image is created with an interpolation process called “de-Bayering,” which uses the surrounding pixels to estimate each color at the pixels where it was not directly measured.

Fig. 3. Bayer mask arrangement of RGB filters.

Fig. 4. Effect of Bayer interpolation on the sensor MTF. The light black line is the luminance-weighted diffraction limit for our lens; the dashed black line shows filtering due to finite pixel size; and the dashed-dotted heavy black line is the corresponding luminance-weighted monochrome response. The magenta dashed–dotted line is the luminance-weighted effect of Bayer interpolation, and the heavy magenta curve shows the complete Bayer MTF response.

Because each individual color is sampled more sparsely than for an equivalent monochrome sensor (assuming equal pixel sizes between the two sensors), the MTF for each color channel will be degraded compared to a monochrome image. Furthermore, the green channel MTF will be affected differently than the red and blue since green is sampled with twice the spatial density of pixels. The theory describing an estimate of this degradation is described in [21]. That theory assumes nearest-neighbor interpolation (which is used in our system). It also combines the three-color MTF curves into a “luminance MTF” using a weighted average of the MTF for each color channel with the weighting given in ISO 12233. This luminance-based MTF curve can be compared with the MTF that would be measured with a monochrome sensor. The luminance weights used in all MTF calculations and measurements are 60% green, 30% red, and 10% blue. With this luminance weighting, the 1D MTF degradation due solely to the linear Bayer interpolation is found to be ${{\rm MTF}_{\text{interpolation}}} = 0.65 + 0.35\cos (2\pi f)$, where $f$ is the spatial frequency in cycles/pixel [21].

Figure 4 shows the theoretical luminance MTF for a monochrome camera and Bayer color camera, calculated for the camera specifications given in Section 2.A. The figure illustrates contributions due to the diffraction limited lens, the finite pixel size, and de-Bayer interpolation [22]. The dashed-dotted heavy black line shows the expected response for the monochrome sensor. It is a point-by-point product of the MTF effects due to diffraction and finite pixel size. The heavy magenta line is for the Bayer color sensor and includes lens diffraction, finite pixel size, and interpolation effects. For the diffraction effects, we have made use of a measured magnification $M = - {0.11}$, and an effective $f$-number for finite-conjugate imaging of $\tilde F\# = (1 - M)F\# = 4.4$ using the manufacturer-provided $F\# = {4.0}$. For midvisible wavelength $\lambda = 0.55\;{{\unicode{x00B5}{\rm m}}}$ and ${3.0}\;\unicode{x00B5}{\rm m}$ pixel spacing, the diffraction limited cutoff frequency at the sensor is

$$\frac{1}{{\lambda \tilde F\#}} = 413\;{\rm{lppm}},$$
corresponding to 1.24 cycles/pixel. Note that this lens can pass spatial frequencies higher than the Nyquist sampling frequency of 0.5 cycles/pixel, implying that aliasing artifacts are a potential problem.
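For readers who wish to reproduce curves like those in Fig. 4, the following MATLAB sketch (not the code used in this work) combines the diffraction, pixel-aperture, and Bayer-interpolation terms described above. For simplicity it uses a single mid-visible wavelength rather than the full per-channel luminance weighting, and all variable names are illustrative.

```matlab
% Illustrative sketch: theoretical MTF contributions for this camera/lens
pixelPitch = 3e-3;                        % pixel pitch, mm (3 um)
lambda     = 0.55e-3;                     % mid-visible wavelength, mm
Feff       = 4.4;                         % effective f-number, (1 - M)*F#
fc = (1/(lambda*Feff))*pixelPitch;        % diffraction cutoff, ~1.24 cycles/pixel

f = linspace(0, 0.5, 201);                % spatial frequency, cycles/pixel (to Nyquist)
u = min(f/fc, 1);                         % normalized frequency, clipped at cutoff

% Diffraction-limited MTF of an aberration-free circular pupil (incoherent)
MTFdiff = (2/pi)*(acos(u) - u.*sqrt(1 - u.^2));

% Finite-pixel MTF: |sinc| of the pixel aperture (assumes 100% fill factor)
MTFpix = ones(size(f));
nz = f > 0;
MTFpix(nz) = abs(sin(pi*f(nz))./(pi*f(nz)));

% Luminance-weighted Bayer interpolation term from [21]
MTFinterp = 0.65 + 0.35*cos(2*pi*f);

MTFmono  = MTFdiff.*MTFpix;               % monochrome / sequential RGB prediction
MTFbayer = MTFmono.*MTFinterp;            % Bayer color camera prediction

plot(f, MTFmono, 'k-.', f, MTFbayer, 'm-', 'LineWidth', 1.5);
xlabel('Spatial frequency (cycles/pixel)'); ylabel('MTF');
legend('Monochrome (theory)', 'Bayer (theory)');
```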

When measuring the MTF spatial frequency response, we employed the slanted edge technique, following the guidelines of ISO-12233. We used the open-source MATLAB code sfrmat4 written by P. Burns [23], along with the plugin SE MTF 2xNyquist written by C. Mitja for ImageJ [24], for occasional comparison and validation.
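As an illustration of the slanted-edge principle, the minimal MATLAB sketch below builds an oversampled edge spread function (ESF) from an ROI containing a near-vertical edge, differentiates it to a line spread function (LSF), and takes its Fourier transform. It is not the sfrmat4 implementation and omits that code's refinements (order-2 edge fitting, empty-bin handling, noise weighting); the function name and binning parameters are assumptions for the example.

```matlab
% Conceptual slanted-edge sketch (not sfrmat4): roi is a grayscale ROI with a
% near-vertical edge; returns an estimated MTF with frequency in cycles/pixel.
function [freq, mtf] = slantedEdgeSketch(roi)
    roi = double(roi);
    [nr, nc] = size(roi);

    % Locate the edge in each row from the centroid of the row derivative,
    % then fit a straight (first-order) edge.
    d = abs(diff(roi, 1, 2));
    edgePos = (d*(1:nc-1)') ./ sum(d, 2);
    p = polyfit((1:nr)', edgePos, 1);

    % Project every pixel onto the edge normal and bin at 4x oversampling
    % to form the edge spread function.
    binW = 0.25;
    dist = zeros(nr*nc, 1);  val = zeros(nr*nc, 1);
    for r = 1:nr
        dist((r-1)*nc+1 : r*nc) = (1:nc)' - polyval(p, r);
        val((r-1)*nc+1 : r*nc)  = roi(r, :)';
    end
    edges = floor(min(dist)) : binW : ceil(max(dist));
    esf = accumarray(discretize(dist, edges), val, [], @mean);

    % Differentiate to the line spread function, window, and Fourier transform.
    lsf = diff(esf);
    n = numel(lsf);
    w = 0.54 - 0.46*cos(2*pi*(0:n-1)'/(n-1));   % Hamming window
    spec = abs(fft(lsf.*w));
    mtf  = spec / spec(1);
    freq = (0:n-1)' / (n*binW);                 % cycles/pixel
    keep = freq <= 1;                           % report up to 1 cycle/pixel
    freq = freq(keep);  mtf = mtf(keep);
end
```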

C. Sequential RGB Frames

In our experiments, we used the manufacturer’s FPGA-based camera interface and user-interface software to acquire and display streaming video and to capture image frames. For Bayer color imaging, no custom modification was necessary. In order to generate sequential RGB frames, we augmented the manufacturer’s camera interface with our own FPGA-based synchronization circuit and used the manufacturer’s software development kit to modify the image acquisition and display software in order to display streaming video and single frames acquired using sequential RGB illumination.

Fig. 5. Row readout operation of a camera with rolling shutter.

For a camera with a global shutter, sequential illumination may be implemented by illuminating the scene with the RGB LEDs in sequence, timed to coincide with the global shutter so that each frame in the video stream contains data for a single color channel. The NanEye cameras employ a rolling shutter, meaning that some rows are integrating signals while other rows are in reset during the line-by-line readout. The rolling shutter is illustrated in Fig. 5. The exposure time is controlled by the number of lines on the sensor in “reset” after being read out; the rest are in integration. Thus, if the illumination color is switched for each frame, there will be a mix of two colors in every frame, regardless of the timing of the color change.

One solution to this problem would be to illuminate with a single color for two consecutive frames, to allow the full integration process to finish for every line in the image. The effective frame rate for a full three-color image would then be reduced by a factor of six compared with the Bayer single-frame solution. A second approach is to switch color at the beginning of each frame, after the last row has been read out, and treat each frame as a linear superposition of two colors, as illustrated in Fig. 6. The pure RGB frame data may be recovered from a set of three mixed-color frames. We use the second method, performing the linear transformation from mixed-color frames to sequential RGB frames in real time for display of the video stream. In this way, the full three-color image only requires three consecutive frames, rather than six.

Fig. 6. Simplified representation of the ideal color mixing for each frame as defined by the RGB sequence and the individual colored LEDs turning on/off at the start of each new frame.

Fig. 7. Images taken during the calibration process before real-time run mode. Each image is titled with the LED action initiated at the beginning of the frame’s integration.

We assume that for each pixel ($m$, $n$), the RGB values can be calculated as shown in Eq. (1):

$$\begin{split}&\left[{\begin{array}{*{20}{c}}{{\rm Red}(m,n)}\\{{\rm Green}(m,n)}\\{{\rm Blue}(m,n)}\end{array}} \right] = \left[{\begin{array}{*{20}{c}}{{a_{\text{red}}}(m,n)}&{{b_{\text{red}}}(m,n)}&{{c_{\text{red}}}(m,n)}\\{{a_{\text{green}}}(m,n)}&{{b_{\text{green}}}(m,n)}&{{c_{\text{green}}}(m,n)}\\{{a_{\text{blue}}}(m,n)}&{{b_{\text{blue}}}(m,n)}&{{c_{\text{blue}}}(m,n)}\end{array}} \right]\\&\qquad\qquad\qquad\qquad\times\left[{\begin{array}{*{20}{c}}{A(m,n)}\\{B(m,n)}\\{C(m,n)}\end{array}} \right], \\ &{\rm{dropping}}\,{\rm{the}}\,{\rm{pixel}}\,{\rm{reference}}\,\left({m,n} \right):\,\left[{\begin{array}{*{20}{c}}{{\rm Red}}\\{{\rm Green}}\\{{\rm Blue}}\end{array}} \right] = M\left[{\begin{array}{*{20}{c}}A\\B\\C\end{array}} \right].\end{split}$$

Three frames are acquired ($A$, $B$, and $C$), each with its own mix of red, green, and blue (due to the rolling shutter of the camera). To keep the mix the same for each transition, the lighting is synchronized with the camera’s frame rate, so that a color switch always occurs just after the last row is read out. We use the illumination sequence red (R), green (G), blue (B), synchronized to the beginning of frames $A$, $B$, and $C$, respectively. We therefore expect that image $A$ will contain a mixture of blue and red, image $B$ will have a mix of red and green, and image $C$ will have a mix of green and blue. To measure this “mix” at each pixel, a calibration step is completed before initiating real-time color video mode.

The terms inside the ${{3}} \times {{3}}$ matrix $M$ in Eq. (1) are initially unknown and represent the weighted contribution of each frame to each color for this pixel. For example, ${a_{\text{red}}}$ is the weight of this pixel’s value, which can be attributed to the red LED being illuminated during the first frame $A$ of the sequence. These terms are a function of the timing of the LED illumination in relation to the rolling shutter of the camera, the number of rows in reset, and the brightness and physical position of each LED. The variable ${\rm Red}$ corresponds to the value to display in the red channel of the final color image (the object reflectivity for that range of wavelengths). This equation is for a single pixel and needs to be computed for each pixel in the image.

The goal is to determine the ${{3}} \times {{3}}$ matrix in Eq. (1) ahead of time so that only a simple multiplication step will be required during the real-time video run mode. This is possible because these values account for the static characteristics of each LED color in the illumination system and the integration time (number of rows in reset), while the measured matrices $A$, $B$, and $C$ contain the color information that describes the scene and its changes for each frame. To find these static coefficient values, a calibration step captures frames imaging a white reference target, which is treated as reflecting 100% of the light, no matter which color is lit. Calibration is done one color at a time, with each color being turned on at the beginning of the first frame, turned off at the beginning of the second frame, and left off for the third and final frame. Sample output images from this process are shown in Fig. 7. The four bright spots are direct reflections from the LEDs and are treated as artifacts to be ignored.

Each image in Fig. 7 is labeled with the frame letter corresponding to the sequence order shown in Fig. 6, with a subscript indicating which LED is active. For example, ${A_{\text{RGB}}} = {A_{100}}$ refers to frame $A$ (acquired in sequence after turning off blue and illuminating red) but with only the red LED active (green and blue set to 0), while ${A_{010}}$ refers to frame $A$ acquired with only the green LED active. These nine images are used to create the calibration equations

$$\left[{\begin{array}{*{20}{c}}1\\0\\0\end{array}} \right] = M\left[{\begin{array}{*{20}{c}}{{A_{100}}}\\{{B_{100}}}\\{{C_{100}}}\end{array}} \right],\left[{\begin{array}{*{20}{c}}0\\1\\0\end{array}} \right] = M\left[{\begin{array}{*{20}{c}}{{A_{010}}}\\{{B_{010}}}\\{{C_{010}}}\end{array}} \right],\left[{\begin{array}{*{20}{c}}0\\0\\1\end{array}} \right] = M\left[{\begin{array}{*{20}{c}}{{A_{001}}}\\{{B_{001}}}\\{{C_{001}}}\end{array}} \right],$$
leading to the solution for $M$ according to
$$M = {\left[{\begin{array}{*{20}{c}}{{A_{100}}}&{{A_{010}}}&{{A_{001}}}\\{{B_{100}}}&{{B_{010}}}&{{B_{001}}}\\{{C_{100}}}&{{C_{010}}}&{{C_{001}}}\end{array}} \right]^{- 1}} = \left[{\begin{array}{*{20}{c}}{{a_{\text{red}}}}&{{b_{\text{red}}}}&{{c_{\text{red}}}}\\{{a_{\text{green}}}}&{{b_{\text{green}}}}&{{c_{\text{green}}}}\\{{a_{\text{blue}}}}&{{b_{\text{blue}}}}&{{c_{\text{blue}}}}\end{array}} \right].$$

The coefficient matrix $M$ must be computed for each pixel. This is done once in the calibration stage when speed is not a concern.
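A minimal MATLAB sketch of this calibration step is shown below. It assumes the nine white-reference frames have been collected into a hypothetical 250 × 250 × 9 array named calib, ordered as indicated in the comments, and simply inverts the 3 × 3 calibration matrix at every pixel; this is illustrative only, not the code supplied in Code 1.

```matlab
% Illustrative per-pixel calibration (hypothetical variable names).
% calib: 250 x 250 x 9 stack of white-reference frames, ordered as
% A100, B100, C100, A010, B010, C010, A001, B001, C001.
[nr, nc, ~] = size(calib);
M = zeros(nr, nc, 3, 3);                     % per-pixel coefficient matrices
for r = 1:nr
    for c = 1:nc
        v = double(squeeze(calib(r, c, :)));
        ABC = [v(1) v(4) v(7);               % [A100 A010 A001;
               v(2) v(5) v(8);               %  B100 B010 B001;
               v(3) v(6) v(9)];              %  C100 C010 C001]
        M(r, c, :, :) = reshape(inv(ABC), 1, 1, 3, 3);  % rows: red, green, blue weights
    end
end
% Reshape into the 62500 x 3 weight arrays used during run mode, e.g. for red:
Wred = [reshape(M(:,:,1,1),[],1), reshape(M(:,:,1,2),[],1), reshape(M(:,:,1,3),[],1)];
```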

Fig. 8. Experimental setup to compare performance of the Bayer camera against the monochrome camera and sequential RGB imaging.

Once calibration is completed, all the information is ready for the normal run mode. During this stage, three frames are taken in, $A$, $B$, and $C$, each of which is a mix of colors (one turning on, one turning off, and one staying off). The calibration has measured this mix, so during run mode, it is only necessary to multiply each of the three frames by its respective weight for each color. In our MATLAB implementation, the incoming ${{250}} \times {{250}}$ pixel images are reshaped into $62500 \times 1$ column vectors and then concatenated into a single ${{62500}} \times {{3}}$ matrix. Then, the following element-wise multiplication can be performed (specified using the MATLAB .* operator). The variable $p$ is defined as the total number of pixels ($m \times n$):

$$\begin{split}&\left[{\begin{array}{*{20}{c}}{A(0)}&{B(0)}&{C(0)}\\ \vdots & \vdots & \vdots \\{A(p)}&{B(p)}&{C(p)}\end{array}} \right].*\left[{\begin{array}{*{20}{c}}{{a_{\text{red}}}(0)}&{{b_{\text{red}}}(0)}&{{c_{\text{red}}}(0)}\\ \vdots & \vdots & \vdots \\{{a_{\text{red}}}(p)}&{{b_{\text{red}}}(p)}&{{c_{\text{red}}}(p)}\end{array}} \right]\\ &= \left[{\begin{array}{*{20}{c}}{A(0)*{a_{\text{red}}}(0)}&{B(0)*{b_{\text{red}}}(0)}&{C(0)*{c_{\text{red}}}(0)}\\ \vdots & \vdots & \vdots \\{A(p)*{a_{\text{red}}}(p)}&{B(p)*{b_{\text{red}}}(p)}&{C(p)*{c_{\text{red}}}(p)}\end{array}} \right]\end{split}.$$

Finally, a sum across the rows is computed to obtain the single-color frame:

$$\begin{array}{*{20}{c}} {\rm Red}(0,0)= {\rm Red}(0)=A(0)*{{a}_{\rm red}}(0)+B(0)*{{b}_{\rm red}}(0)+C(0)*{{c}_{\rm red}}(0), \\ \vdots \\ {\rm Red}(m,n)={\rm Red}(p)=A(p)*{{a}_{\rm red}}(p)+B(p)*{{b}_{\rm red}}(p)+C(p)*{{c}_{\rm red}}(p).\end{array}$$

This gives the red value for each pixel in the final color image. The process must then be repeated for the other two colors. Once the ${{62500}} \times {{1}}$ matrices for each color are calculated, they can be reshaped into ${{250}} \times {{250}}$ matrices and concatenated to produce an RGB image for display. This process is repeated as each new $A$, $B$, or $C$ frame is taken in, so that the displayed color video has the same update frame rate as the camera itself (nominally 56 fps). However, it still takes three sequential frames for all color channels to be computed, so there is a time latency associated with this procedure.
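The run-mode reconstruction described above can be summarized in the following MATLAB sketch, where frameA, frameB, and frameC are hypothetical names for the latest mixed-color frames and Wred, Wgreen, and Wblue are the 62500 × 3 weight arrays from calibration (e.g., Wred = [a_red(:) b_red(:) c_red(:)]); it is an illustration, not the supplied Code 1.

```matlab
% Illustrative run-mode color reconstruction (hypothetical names).
ABC = double([frameA(:), frameB(:), frameC(:)]);      % 62500 x 3 mixed-color data
red   = reshape(sum(ABC .* Wred,   2), 250, 250);     % weighted sum per pixel
green = reshape(sum(ABC .* Wgreen, 2), 250, 250);
blue  = reshape(sum(ABC .* Wblue,  2), 250, 250);
rgb = cat(3, red, green, blue);                       % 250 x 250 x 3 color image
rgb = max(rgb, 0) / max(rgb(:));                      % clip negatives and scale for display
imshow(rgb);                                          % update the display
```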

D. Experimental Setup

To measure the MTF of each imaging system, the Bayer and monochrome cameras were set up to have identical fields of view (FOVs), as shown in Fig. 8. Each camera has a 120° FOV, $f/{{4}}$ lens with a best focus of 5–8 mm. The height of each camera is variable in this setup, along with the height and angle of the target. Overhead room lighting was turned off for all images and measurements. A slanted edge printed onto a paper target (Applied Image, Inc.) was imaged in the vertical and horizontal directions with a slant angle of 5°–10°. The region of interest (ROI) was chosen to encompass the center of the image while excluding any hotspots from the illumination system. ROIs were chosen to be at least 64 pixels across in both directions. An edge fit order of 2 was chosen to account for the image distortion due to the built-in lenses [25]. The lighting and gain of the camera were adjusted for each image so that no saturation occurred.

3. RESULTS

A. Sample Images

The dermatology microscope system being developed in this work will primarily rely on human interpretation of the displayed color video. Thus, the appearance “to the eye” is as important as more quantitative measures such as the MTF. Figure 9 shows an RGB bar target printed on standard paper, imaged at a 5 mm distance, comparing the image from the Bayer color camera to the image from the monochrome camera with sequential RGB lighting. The frames $A$, $B$, and $C$ are shown, each with a mix of two colored LEDs, as explained above, for the sequential RGB method. See Visualization 1 for a video clip generated in real time by our sequential RGB software.

Fig. 9. Raw sequential RGB images (top) and full color images using the Bayer mask camera (left bottom) and reconstructed via the sequential RGB (right bottom) system.

Fig. 10. Comparative images of USAF three-bar targets taken with each of the three imaging methods investigated in this work. Each zoomed-in portion of the images has had its brightness increased by 30% to help with comparison.

Fig. 11. Vertical slanted edges used for MTF calculation for each imaging method.

Figure 10 shows a group of USAF three-bar targets imaged at a 5 mm distance with the Bayer camera, the monochrome camera with simultaneous illumination of the RGB LEDs to produce white light, and a color image using the sequential RGB method with the monochrome camera. This shows the clear difference in resolution between the two cameras, especially at higher spatial frequencies. Additionally, the sequential RGB method does not noticeably lose any resolution compared with the monochrome camera.

Figure 11 contains images of a vertical slanted edge taken with the Bayer camera, monochrome camera with white light, and sequential RGB method with monochrome camera. These subimages correspond to the ROI chosen for the MTF measurements illustrated in Figs. 12 and 13.

Fig. 12. Measured luminance-weighted MTF for Bayer color camera (solid magenta curve), monochrome camera (dashed–dotted black curve), and sequential RGB using monochrome camera (dashed cyan curve). Theoretical responses are shown in the light dotted curves.

Fig. 13. Measured MTF curves for the individual color channels from the Bayer camera (solid lines) and sequential RGB camera (dashed lines).

Fig. 14. Separated color channels of the Bayer and sequential RGB for the vertical slanted edges shown in Fig. 11. The computed and normalized edge spread functions are shown along the bottom row for both the Bayer (dashed gray curve) and sequential RGB (solid black curve) for each color channel.

We should point out that no work was done here to adjust the color balance for the sequential RGB system. There are differences between the Bayer filter spectral shapes and the LED emission spectra. In addition to recovering RGB information from the recorded image frames, further processing could be employed to better balance hue perception, but no such processing was employed for this study. In addition to the difference in hue perception, the images acquired using sequential illumination, as shown in Figs. 11 and 14, appear to show more intensity fluctuation noise, despite the integration times for Bayer and sequential RGB image acquisition being the same. This may result from the additional image processing required for sequential RGB frame recovery, which is sensitive to noise in the saved calibration images shown in Fig. 7 as well as noise in the frames acquired during real-time imaging. Noise of this type in linear unmixing processes with background subtraction is also discussed in [12].

Figure 14 shows two sets of images with identical sizes, comparing the RGB frames from the Bayer color camera and the RGB frames computed for the sequential illumination scheme using the monochrome camera. There is apparent degradation in resolution for the Bayer camera’s red and blue channels compared with the green channel, with no corresponding difference in resolution between the RGB frames computed from the sequential illumination images.

B. Comparative MTF Curves

Figure 12 shows the measured and theoretical luminance-weighted MTF curves for the Bayer camera, the monochrome camera with white light, and the sequential RGB system. Note that the theoretical curve for the sequential RGB method is identical to that of the monochrome camera. This shows that, for both the measured and theoretical cases, the monochrome camera outperforms the Bayer mask. The sequential RGB method also surpasses the Bayer mask and is similar to the MTF of the monochrome sensor.

Figure 13 separates the Bayer and sequential RGB measurements into each color channel. By plotting this comparison, we see that the green channel of the Bayer camera has higher MTF values than the red and blue channels, which are degraded due to their lower spatial sampling rates. By comparison, the sequential illumination RGB images show higher MTF values for all color channels. Furthermore, there are slightly higher MTF values for the shortest blue wavelengths and slightly lower MTF values for the longer red wavelengths, which is consistent with the lens-limited diffraction effect.

4. DISCUSSION

A. Resolution

Using the sequential RGB method does improve the resolution of the real-time color video displayed to the user. The measured MTF50 (spatial frequency resulting in 50% luminance MTF) increased from 0.12 cycles/pixel to 0.16 cycles/pixel. A larger difference is seen at higher spatial frequencies. For example, the measured MTF20 (spatial frequency resulting in 20% luminance MTF) increased from 0.23 cycles/pixel to 0.36 cycles/pixel, a 57% increase. The improvement is observable by eye in the USAF three-bar targets shown in Fig. 10 as well. It is also apparent in the slanted edge images separated by color, shown in Fig. 14, where the red and blue color channels of the Bayer image are noticeably more pixelated.
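For reference, threshold frequencies such as MTF50 and MTF20 can be read from a measured curve by finding the first sample that falls below the threshold and interpolating linearly, as in this MATLAB sketch (variable and function names are assumed, not taken from Code 1).

```matlab
% Illustrative sketch: read a threshold frequency from a measured MTF curve.
% freq (cycles/pixel) and mtf come from a slanted-edge measurement; assumes
% the curve starts above the requested level, i.e., mtf(1) >= level.
function fx = mtfThreshold(freq, mtf, level)
    k = find(mtf < level, 1);                          % first sample below threshold
    fx = freq(k-1) + (level - mtf(k-1)) * ...
         (freq(k) - freq(k-1)) / (mtf(k) - mtf(k-1));  % linear interpolation
end
% Example: mtf50 = mtfThreshold(freq, mtf, 0.50); mtf20 = mtfThreshold(freq, mtf, 0.20);
```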

B. System Impacts and Further Work

While the sequential RGB method can produce higher spatial frequency response, there are other important consequences. Because three consecutive images are needed to create a single color image, there is a latency that can cause motion artifacts [10], though there are methods to reduce this effect [26,27]. These artifacts can be quite distracting to the human observer, as shown in Visualization 1. For our implementation, we employed a synchronization FPGA circuit that monitored the frame transfer information, extracted from the serial LVDS data stream, to synchronize the LED illumination to the camera readout cycle. While not a fundamental limitation, this implementation was prone to occasional loss of synchrony and misidentification of the $A$, $B$, and $C$ frames, leading to the “color switching” also seen in Visualization 1. This occurs when the signal from the camera is misread by either FPGA, which leads to duplicate images or a switched color order during the reconstruction process. The current LVDS signal-splitting method could be replaced with a single LVDS FPGA receiver programmed to accomplish all of the synchronization and frame processing for RGB color extraction in the FPGA hardware rather than on the host PC. Such a scheme would eliminate the color-switching artifacts and open up the possibility for more sophisticated on-the-fly image processing as well.

C. Measurement Limitations

Several factors may have impacted the MTF measurements. Direct reflections from the fixed LED positions limited the ROI size; this in turn limited the possible edge response oversampling factor. Also, it was difficult to control camera alignment precisely, to ensure that the cameras were pointing down perfectly normal to the targets. This was accommodated by tilting the target or the camera on their optical mounts, but alignment was by eye, and no further adjustments were made. The distance between camera and target was, however, accurately measured using the fine adjustment screw on the $z$-axis translation optical mount. Other factors such as the electrical characteristics of the image sensors and inherent optical aberrations may have also contributed to the differences between the theoretical and measured MTF curves. In particular, there is likely a systematic defocus error affecting the Bayer and monochrome images. This may have impacted all MTF measurements, but it should not have biased the comparison in favor of one camera over the other. Finally, in this study we did not compare color performance of the two methods.

5. CONCLUSION

We have created an imaging system that generates and displays real-time color video utilizing a monochrome sensor with a rolling shutter and synchronized sequential RGB LED illumination. Using this system, we measured and compared the spatial frequency response of the sequential RGB method to a Bayer image sensor using the same sensor size and lens. The sequential RGB method does improve the resolution of images compared to an identical camera with a Bayer mask. Increases in resolution were especially apparent for higher spatial frequencies, with the measured MTF20 (spatial frequency resulting in 20% luminance MTF response) increasing by 57% relative to the Bayer response. As pixel sizes shrink and pixel numbers increase for this mm-scale class of cameras, the reduced resolution penalty of the Bayer mask may become less significant and must be weighed against latency and complexity costs of the sequential RGB method. However, for the miniature cameras with limited pixel number ($250 \times 250$) and relatively large 3 µm pixels that we tested, the improvement in spatial frequency response was easily measured in the MTF and appreciable by eye in sample images. We conclude that color imaging using a rolling-shutter monochrome sensor with sequential RGB illumination can be done in near real-time and with superior spatial resolution to a Bayer color version of the same sensor.

Funding

National Institute of Biomedical Imaging and Bioengineering (R01-EB028752-04).

Acknowledgment

The authors acknowledge S. Orser and E. Himes for their contributions to the electronic systems and software used to collect the data presented in this paper.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

REFERENCES

1. A. Boese, C. Wex, R. Croner, U. B. Liehr, J. J. Wendler, J. Weigt, T. Walles, U. Vorwerk, C. H. Lohmann, M. Friebe, and A. Illanes, “Endoscopic imaging technology today,” Diagnostics 12, 1262 (2022). [CrossRef]  

2. T. Nakamura and A. Terano, “Capsule endoscopy: past, present, and future,” J. Gastroenterol. 43, 93–99 (2008). [CrossRef]  

3. C. Liu, A. Berkovich, S. Chen, H. Reyserhove, S. S. Sarwar, and T.-H. Tsai, “Intelligent vision systems—bringing human-machine interface to AR/VR,” in IEEE International Electron Devices Meeting (IEDM) (2019), pp. 10.5.1–10.5.4.

4. K. W. Lin, T. K. Lau, C. M. Cheuk, and Y. Liu, “A wearable stereo vision system for visually impaired,” in IEEE International Conference on Mechatronics and Automation (2012), pp. 1423–1428.

5. D. L. Dickensheets, S. Kreitinger, G. Peterson, M. Heger, and M. Rajadhyaksha, “Wide-field imaging combined with confocal microscopy using a miniature f/5 camera integrated within a high NA objective lens,” Opt. Lett. 42, 1241–1244 (2017). [CrossRef]  

6. “CMOS micro camera module,” https://ams.com/en/cmos-micro-camera-module.

7. E. G. Nassimbene, “Color video record and playback system,” U.S. patent US3529080A (15 September 1970).

8. G. D. Sharp and K. M. Johnson, “New RGB tunable filter technology,” Proc. SPIE 2650, 98–105 (1996). [CrossRef]  

9. A. Raz and D. Mendlovic, “Sequential filtering for color image acquisition,” Opt. Express 22, 26878–26883 (2014). [CrossRef]  

10. F. Xiao, J. M. DiCarlo, P. B. Catrysse, and B. A. Wandell, “Image analysis using modulated light sources,” Proc. SPIE 4306, 22–30 (2001). [CrossRef]  

11. J. I. Shipp and J. L. Goodell, “Single sensor video imaging system and method using sequential color object illumination,” U.S. patent US5264925A (23 November 1993).

12. T. Zimmermann, J. Rietdorf, and R. Pepperkok, “Spectral imaging and its applications in live cell microscopy,” FEBS Lett. 546, 87–92 (2003). [CrossRef]  

13. “ScopeLED light sources,” BioVision Technologies, https://www.biovis.com/light_scopeled.html.

14. R. Johnston, C. Lee, and C. Melville, “Methods and systems for creating sequential color images,” U.S. patent US20060226231A1 (12 October 2006).

15. E. Shechtman, Y. Caspi, and M. Irani, “Space-time super-resolution,” IEEE Trans. Pattern Anal. Mach. Intell. 27, 531–545 (2005). [CrossRef]  

16. S. C. Park, M. K. Park, and M. G. Kang, “Super-resolution image reconstruction: a technical overview,” IEEE Signal Process. Mag. 20(3), 21–36 (2003). [CrossRef]  

17. M. M. Islam, V. K. Asari, M. N. Islam, and M. A. Karim, “Super-resolution enhancement technique for low resolution video,” IEEE Trans. Consum. Electron. 56, 919–924 (2010). [CrossRef]  

18. N. R. Shah and A. Zakhor, “Resolution enhancement of color video sequences,” IEEE Trans. Image Process. 8, 879–885 (1999). [CrossRef]  

19. C. Jaques, E. Pignat, S. Calinon, and M. Liebling, “Temporal super-resolution microscopy using a hue-encoded shutter,” Biomed. Opt. Express 10, 4727–4741 (2019). [CrossRef]  

20. J. Anspach, “Sequential-RGB-illumination,” figshare, 2023, https://doi.org/10.6084/m9.figshare.22129361.

21. E. Yotam, P. Ephi, and Y. Ami, “MTF for Bayer pattern color detector,” Proc. SPIE 6567, 65671M (2007). [CrossRef]  

22. G. D. Boreman, Modulation Transfer Function in Optical and Electro-Optical Systems (SPIE, 2001).

23. “Spatial frequency response (SFR): sfrmat,” Burns Digital Imaging, http://burnsdigitalimaging.com/software/sfrmat/.

24. “Slanted Edge MTF,” https://imagej.nih.gov/ij/plugins/se-mtf/index.html.

25. P. D. Burns and D. Williams, “Camera resolution and distortion: advanced edge fitting,” in Proc. IS&T Int’l. Symp. on Electronic Imaging: Image Quality and System Performance XV (2018), pp. 171-1–171-5.

26. N. Ohyama, E. Badiqué, M. Yachida, J. Tsujiuchi, and T. Honda, “Compensation of motion blur in CCD color endoscope images,” Appl. Opt. 26, 909–912 (1987). [CrossRef]  

27. A. A. Volfman, D. Mendlovic, and A. Raz, “Pentagraph image fusion scheme for motion blur prevention using multiple monochromatic images,” Appl. Opt. 55, 3096–3103 (2016). [CrossRef]  

Supplementary Material (2)

Code 1: MATLAB code for camera API and FPGA code for synchronizing LED illumination.
Visualization 1: Video capture during real-time display with sequential RGB illumination and frame processing.


