Depth extraction with offset pixels

W. J. Yun; Y. G. Kim; Y. M. Lee; J. Y. Lim; H. J. Kim; M. U. K. Khan; S. Chang; H. S. Park; C. M. Kyung

doi:10.1364/OE.26.015825

1. Introduction

Depth sensing with a compact system opens a wide variety of exciting applications, especially in the era of hand-held devices. Numerous applications such as facial expression recognition, hand gesture recognition, image-based scanning and augmented reality require small and fast depth extraction imaging systems. Porting such applications to mobile devices presents a great challenge due to power, size and speed constraints. A system that can estimate color as well as depth of a scene and is portable on mobile devices will allow development of numerous innovative and life-changing applications.

Numerous depth extraction schemes have been developed in the past; however, their feasibility especially for mobile devices is questionable. Structured light-based methods [1] illuminate the scene with a light pattern and analyze the deformation in the light pattern to estimate depth. Depth sensing systems based on time-of-flight (ToF) sensors [2] consider the time taken by a light source to be reflected back to estimate the depth of the scene. Both these schemes require active illuminants; thus, these schemes do not work in outdoor conditions. Furthermore, generally the depth extracted by ToF sensors is blurred with over-smoothed object boundaries. Stereo matching, which has its roots in binocular vision of animals, has been the most thoroughly investigated depth extraction scheme in literature [3]. However, stereo-based schemes require at least two camera sensors to observe the given scene; thereby, these schemes require excessive resources. Perhaps, it is due to the above reasons that depth-sensing cameras have not penetrated through the commercial markets despite rigorous efforts.

To limit the resources used for depth extraction, depth extraction systems with a single sensor have been proposed recently. These systems estimate depth with a single shot. [4] proposes Dual Aperture (DA) camera which is a system based on two co-centric apertures with different diameters and spectral characteristics. They allow IR to pass through a smaller aperture and RGB through a larger aperture. The difference of blur of the IR and RGB images is used to estimate depth of the scene. In [5], Offset Apertures (OA) for the IR and RGB images are proposed, where the disparity of the IR and RGB images is used to estimate depth. Although the IR image allows well-aligned RGB images, the IR signal can corrupt the RGB image. Furthermore, the computational requirement of [4] is very high as it requires estimating the blur difference through convolution with point spread functions of various sizes.

Recently, the concept of Offset Pixel Aperture (OPA) was presented in [6] and its manufacturing was discussed in [7]. Pixels are offset by having the metal layer opening at different locations for pairs of pixels. OPA retains the advantage of OA as it is based on disparity which can be more accurately estimated compared to the de-focus blur. Moreover, OPA does not use the IR signal; thereby, it avoids the RGB signals being corrupted by the IR signal. Avoiding cross-spectral disparity estimation has yet another advantage in that improved depth quality is observed. Another key advantage that the OPA camera carries compared to stereo cameras is that image rectification is not required as the pixels are well aligned. This not only removes the computations but also reduces the amount of memory required to perform rectification.

Despite all its advantages, there are numerous challenges to adopt the OPA sensor for real life applications These include lack of green pixels, severe defocus, shading and noise artifacts. Perhaps, it is due to these reasons that the results of [6] and [7] are only limited computer simulations and a point light source, respectively. In this paper, we present a compact depth-extraction system for the OPA sensor for practical utility. The key contributions of this paper are as follows.

We present the first demonstration of color image and depth extraction with the OPA principle. We propose a configuration of algorithms that can deal with the unique challenges presented by the OPA sensor. The performance of our method is validated through experimental results.
We propose a dedicated implementation of the depth extraction system. For this, we make rigorous efforts towards minimizing the hardware costs of the dedicated implementation. As a result, the proposed system is capable of depth extraction in real time with WUXGA-resolution videos.

OPA carries numerous advantages compared to competing technologies. Generally, using disparity provides better estimates of depth of a scene compared to using blur. Having multiple sensors and/or rectification increase cost. Also, IR corrupts the RGB image quality. OPA uses disparity for depth estimation, requires a single sensor, does not require rectification and does not use IR, making it a better candidate for depth estimation as summarized in Table I.

Table 1. Comparison of camera based depth extraction schemes

View Table | View all tables in this article

The rest of this paper is organized as follows. The OPA camera is discussed in Section 2. The depth extraction method with all the constituent stages using OPA is described in Section 3. Section 4 describes the proposed hardware implementation for fast depth extraction. The system evaluation through experiments is discussed in Section 5.

2. Offset pixel aperture camera

Unlike preceding single-shot depth estimation systems, the OPA camera has apertures offset at the pixel level. In detail, for every row the centers of the odd pixels apertures are offset from the centers of the even pixel apertures. This offset is across the pixels of the same channel only. This generates a depth-dependent disparity across pixels, which indicates the depth. A graphical illustration of the OPA camera is shown in Fig. 1. The pixel apertures are at different locations compared to the pixel centers. The OPA sensor is fabricated using 110 nm, 2 poly and 4 metal layers CMOS Image Sensor (CIS) technology. In the conventional image sensor, the metal layers are commonly used for wiring transistors and do not cover the photodiode for sensing light. In contrast, the OPA sensor utilizes the first metal layer to form the offset aperture by covering the photodiodes. The pixel pitch is 2.8 um, the size of the pixel aperture is 1.3 um × 2.8 um and the pixel apertures are offset by 0.75 um from the pixel centers. In this work, we used a lens system with a focal length of 6 mm and an F-number of 1.4.

Fig. 1 Graphical illustration of the design of OPA camera. The pixel apertures are at different locations relative to the pixel center.

Name	Description
Visualization 1	This video shows a demonstration of the proposed depth extraction system with OPA camera.
Visualization 1	This video shows a demonstration of the proposed depth extraction system with OPA camera.

Feature	Influence	Stereo [3]	DA [4]	OA [5]	OPA [7]

Depth principle	Accuracy	Disparity	Blur	Disparity	Disparity
No. of sensors	Cost	2	1	1	1
Rectification needed	Cost	Yes	No	No	No
IR used	Image quality	No	Yes	Yes	No

Block		Size[KBytes]

Pre-processing	Channel splitter	5 × 1
	Mean filter	4 × 4
	Local normalization	8 × 2

Stereo matching unit		32 × 2

Post-processing	Depth noise reduction	8 × 1, 5 × 1
Post-processing	Synchronizer	44 × 1

Total		158

Block		Gate count[K]

Depth transceiver		1.115
I2C slave		5.677
SPI slave		6.153
TPG, Synchronizer, Debouncer		40.152

Pre-processing	Channel splitter	36.31
	Mean filter	109.349
	Local normalization	89.616

Stereo matching unit	SAD	728.681
	SGM	42.169
	WTA	28.518

Post-processing	Depth noise reduction	68.260

Total		1156

Abstract

1. Introduction

2. Offset pixel aperture camera

3. Depth extraction using OPA camera

3.1. Pre-processing

3.2. Main processing

3.3. Post-processing

4. Dedicated implementation

5. Experimental results

5.1. Quantitative analysis

5.2. Color quality of OPA camera

5.3. Hardware implementation

5.4. Applications

6. Conclusion

Funding

References and links

Supplementary Material (2)

Cited By

Figures (18)

Tables (3)

Equations (8)

Optics Express