## Abstract

In this paper we present a new approach that provides super-resolved imaging at the center of the field of view while still showing the remainder of the original field of view at the original resolution. The operation resembles optical zooming, except that the zoomed and the non-zoomed images are obtained simultaneously, from a single snapshot and with a single imaging lens. The technique uses a special static coding element and a post-processing algorithm, without any mechanical movement.

©2005 Optical Society of America

## 1. Introduction

Optical zooming is basically a super resolution technique, since its purpose is to obtain a resolution higher than that provided by the imaging system prior to zooming. The physical restrictions that limit the spatial resolution of an imaging system are either the aperture size of the imaging lens or the geometrical parameters of the detection array, such as its pitch and fill factor. The more restrictive of the two limitations prevails.

The common optical realization of optical zoom includes several lenses and a mechanical mechanism, as in Ref. [1]. Other principles involve no mechanical movement but rely on other time-adaptive concepts that vary the overall focal length of the lens. Refs. [2–13] give examples of several works dealing with zoom lenses. Basically, the zooming operation is an increase in the focal length of the imaging module, which gives each detector pixel a smaller footprint on the object. The spatial resolution in the center of the field of view improves during zooming because the footprint of each pixel on the object equals ΔxR/F, where Δx is the pitch of the detector pixels, F the focal length and R the distance to the object. The regular optical zooming operation has two major disadvantages. The first is that increasing the focal length, for instance by a factor of 3, while preserving the F-number increases the volume of the imaging module by a factor of 3^{3}=27. This means more weight and, because of the mechanical mechanism, less reliability. The second is that the zoomed and the non-zoomed images are not obtained simultaneously: the resolution improvement in the central part of the field of view comes at the expense of a reduced field of view.

In this paper we present a novel zooming approach in which only a single lens is used instead of several lenses. In addition, since no movement is required and the focal length is not changed, the volume of the imaging module is not increased. Note that the resolution improvement in the center of the field of view is not due to an increase of the focal length F but rather to the generation of smaller effective pixels in that spatial region, that is, a reduction of Δx by the same factor by which F would otherwise have been increased. Finally, the zoomed and non-zoomed images are obtained simultaneously in a single snapshot. The resolution improvement obtained in the central part of the field of view follows the idea presented in Ref. [14]. However, Ref. [14] shows only how to obtain the resolution improvement, whereas in this paper we show how to obtain it without sacrificing the field of view, i.e. while also obtaining the non-zoomed resolution in the remaining part of the field of view. Note that having an improved resolution in the central part of the field of view while preserving the original non-zoomed resolution in the outer parts yields more spatially resolved points than there are pixels in the detector array. Such an outcome is made possible by a trade-off paid in the dynamic range of the captured image.

The operation principle is based on the following: the image resolution obtained with a common single lens is highest in the center of the field of view and degrades towards the periphery, because the surface on which a perfect image is formed is a sphere rather than a plane. Exploiting this property is essential for the proposed operation principle. The optical limit for the resolution obtained in the center is proportional to λF/D (where λ is the wavelength, F is the focal length and D is the aperture of the lens). For many detectors this optical limit is much less restrictive than the limit imposed by the sampling pitch of the detector. Consequently, in such cases it is the detector that forces the poorer image quality. In our technique the optics shall provide, in the center of the field of view, an optical resolution that is limited by diffraction. In the remaining part of the field of view the optics shall provide a resolution limit equal to the detector’s sampling pitch. In this manner, exploiting the aliasing effect caused by the detector sampling and performing some digital post processing results in a super resolved image. It has a diffraction-limited resolution in the center region of the field of view and yet preserves the original geometrical resolution in its outer parts.
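To make the comparison between the two limits concrete, the following minimal sketch evaluates the scale λF/D against a detector pitch. The proportionality constant is taken as 1, and the numerical values (wavelength, lens dimensions, pitch) as well as the function names are illustrative assumptions, not parameters of this work.

```python
def diffraction_scale(wavelength, focal_length, aperture):
    """Diffraction-limited resolution scale, taken here as exactly lambda*F/D.

    The text only states proportionality; a constant of 1 is an assumption.
    """
    return wavelength * focal_length / aperture


def limiting_factor(wavelength, focal_length, aperture, pitch):
    """Return which limit is coarser (and therefore prevails) and the pitch/optics ratio."""
    optical = diffraction_scale(wavelength, focal_length, aperture)
    name = "detector" if pitch > optical else "optics"
    return name, pitch / optical


# Hypothetical values: green light, a 28 mm lens with a 10 mm aperture, 6 micrometer pixels.
name, ratio = limiting_factor(wavelength=0.55e-6, focal_length=28e-3,
                              aperture=10e-3, pitch=6e-6)
print(name, round(ratio, 2))
```

With these assumed numbers the detector pitch is several times coarser than the optical scale, which is the detector-limited regime the technique is meant to exploit.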

In Section 2 we present the theory of the suggested approach. In Section 3 we present the simulation investigation, and Section 4 concludes the paper.

## 2. Theory

We will now derive the theory showing how we may obtain simultaneously the improved resolution in the central part of the field of view (zoomed image) while preserving the original non-zoomed resolution in the outer parts.

#### 2.1 Preliminary

For the sake of simplicity the analysis of the method will be one-dimensional (1-D). The extension to two dimensions is straightforward.

Let us take a 1-D positive object s(x) (see Fig. 1). We denote by L_{T} its spatial support. The finest detail of this object in its central part, of extent L_{C}, is denoted by δx. In the following mathematical analysis we will consider L_{C} to be 1/6 of L_{T}, although other ratios can be chosen. The finest optically resolved detail in the remaining periphery is three times larger, 3δx, which equals the geometrical limitation set by the pitch of the sampling detection array (see Fig. 1). This limitation is determined by the optics and exists prior to the digital sampling performed by the detection array.

One wishes to image this object using an ideal aberration-free optical system with a magnification factor of 1. The image is captured using a camera with a pixel pitch of 3δx, whose pixels are assumed to act as an ideal spatial Dirac impulse train. The proposed method enables resolving details with high resolution in the central part, in spite of the larger pitch, without a decrease in the field of view. In optical terms, we obtain an optical zoom of X3 in the central 1/6 of the field of view and simultaneously retain the X1 resolution (without zooming) in the other 5/6 of the field of view. All of this is obtained from a single optically coded and then digitally processed image. The penalty is the introduction of some noise in the obtained image. The optical coding involves inserting a spatial coding grating in the entrance pupil plane of the imaging lens. The super resolving approach that increases the resolution in the central 1/6 of the field of view is based upon the approach presented in Ref. [14]. The case investigated here deals with coherent illumination, although the extension to the non-coherent case is straightforward, as described in Ref. [14].

Note that the geometrical super resolution method described in Ref. [14] is equivalent to the realization of an optical zoom in the central part of the field of view, since the footprint of a pixel of the super resolved image on the observed object equals (R/F)∙(Δx/κ), where R is the distance between the camera and the object, F is the focal length, Δx is the pitch of the camera pixels (Δx=3δx) and κ is the geometrical super resolution factor (we always discuss the case of κ=3). When an optical zoom of factor κ is performed, the focal length is changed to κF and the footprint therefore equals (R/(κF))∙Δx. Both expressions are identical. Thus, in Ref. [14] we have shown how optical zooming can be performed without changing the focal length, by performing geometrical super resolution instead. However, the condition for the operation of the approach presented in Ref. [14] is that the input object occupies no more than 1/κ of the field of view. As previously mentioned, in this paper we show how this field of view restriction is removed, so that one obtains simultaneously the super resolved image (the optical zoom image) in the center of the field of view and the original resolution image in its outer part.
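The equivalence of the two footprint expressions can be checked with a few lines of arithmetic. The sketch below uses hypothetical values for R, F and Δx (they are not taken from this work) and also evaluates the κ^{3} volume growth that the Introduction attributes to a conventional zoom at constant F-number.

```python
def footprint(distance, focal_length, pitch):
    """Footprint of one pixel on the object: pitch * R / F."""
    return pitch * distance / focal_length


# Hypothetical values: object at 100 m, 50 mm lens, 6 micrometer pixels, kappa = 3.
R, F, dx, kappa = 100.0, 50e-3, 6e-6, 3

optical_zoom = footprint(R, kappa * F, dx)   # focal length increased to kappa * F
geometric_sr = footprint(R, F, dx / kappa)   # effective pixel reduced to dx / kappa
volume_factor = kappa ** 3                   # volume growth of the conventional zoom

print(optical_zoom, geometric_sr, volume_factor)
```

Both routes give the same footprint on the object, but only the conventional zoom pays the volume factor.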

#### 2.2 Mathematical general description

We denote by S(ν) the Fourier transform of the object s(x), where ν is the spatial frequency coordinate, ν ∊ [-ν_{max}, ν_{max}], and ν_{max} is the maximal spatial frequency of the object, which is inversely related to the spatial resolution δx. We virtually divide the Fourier content into three equal regions:

- Left third S_{-1}(ν) with ν ∊ [-ν_{max}, -1/3 ν_{max}]
- Central third S_{0}(ν) with ν ∊ [-1/3 ν_{max}, 1/3 ν_{max}]
- Right third S_{1}(ν) with ν ∊ [1/3 ν_{max}, ν_{max}].

The spatial grating multiplies these spectral components so that a certain degree of orthogonality between the components is created. The coding grating mask also consists of three regions:

- Left third G_{-1}(ν) with ν ∊ [-ν_{max}, -1/3 ν_{max}]
- Central third G_{0}(ν) with ν ∊ [-1/3 ν_{max}, 1/3 ν_{max}]
- Right third G_{1}(ν) with ν ∊ [1/3 ν_{max}, ν_{max}].

The chosen mask fulfils the orthogonality condition of:

where δ[l,k] is the Kronecker delta function. When the image is under-sampled by the detector, an aliasing effect takes place. In fact, the aliasing is essentially a folding of S_{-1}(ν) and S_{1}(ν) into the central third of the spectrum. Therefore, the spectrum of the captured image equals:

To improve the clarity of this presentation, let us briefly recall the derivation made in Ref. [14]. Let us examine a simple situation in which we want to enhance the resolution by a factor of three. We assume an ideal CCD in which the pixels are infinitesimally small and are placed at a distance of Δx from one another (according to Fig. 1, Δx=3δx). We will now show that when one is willing to restrict the object to the central 1/3 of the field of view, one can improve the resolution in that central 1/3 by a factor of 3 (without increasing the focal length by a factor of 3). In the case of ideal sampling, the sampling function of the CCD [denoted as CCD(x)] is modeled as an infinite train of impulses:
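In the usual ideal-sampling model, such an impulse train with unit weights and pitch Δx takes the following standard form (stated here as an assumption about the intended normalization):

$$\mathrm{CCD}\left(x\right)=\sum _{n=-\infty}^{\infty}\delta \left(x-n\Delta x\right)$$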

As previously mentioned the coding mask [denoted as CDMÃ(ν)] is divided into three sub functions as follows:

The CDMA mask is multiplied in the Fourier plane with the spectrum of the input signal s(x) [denoted as S(ν)]. This multiplication occurs because the coding mask is positioned in the coherent transfer function (CTF) plane of the imaging lens. In the coherent case, the Fourier transform of the imaged object is obtained in the CTF plane. In the non-coherent case this position is also related to the spectrum of the imaged object.

This spatial distribution is multiplied by CCD(x), the sampling grid of the CCD, which means that it is convolved with the Fourier transform of the CCD grid in the spectral domain:

where * denotes the convolution operation. Since Δν = 2π/Δx, the last expression can be simplified to:

$$=\sum _{n=-\infty}^{\infty}S\left(\nu -n\Delta \nu \right)\left[\sum _{k=-1}^{1}{G}_{n}\left(\nu -\left(n+k\right)\Delta \nu \right)\right]$$

Image retrieval is simply achieved by Fourier transforming the grabbed output, multiplying it by the original coding mask and then downsampling:

$$=\sum _{n=-\infty}^{\infty}S\left(\nu -n\Delta \nu \right){G}_{n}\left(\nu -n\Delta \nu \right)=S\left(\nu \right)\,\mathrm{CDM}\tilde{\mathrm{A}}\left(\nu \right)\;\xrightarrow{\text{downsampling}}\;S\left(\nu \right)$$

We see that modulating the input’s spectrum by multiplying it with the coding mask prevents data corruption due to aliasing. This was proven in Ref. [14] and demonstrated experimentally. It indeed demonstrates super resolution, i.e., an effect equivalent to seeing an image with a zoom of X3 without changing the focal length. But this improvement is obtained only in the central 1/3 of the field of view, and only when the input object occupies no more than that 1/3. Let us now continue by proving that we can obtain the super resolved image in the central field of view without paying for it with the outer 2/3 of the field.
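The recalled principle can be illustrated with a toy discrete 1-D sketch. It makes several assumptions that are not part of Ref. [14] or of this work: the coding mask is a binary comb that keeps every κ-th spectral bin (chosen only because it is the simplest mask whose folded thirds do not overlap), the object is confined to 1/κ of the field of view, the number of object samples M is coprime with κ, and the function names are hypothetical.

```python
import numpy as np


def encode_and_sample(s_center, kappa=3):
    """Code the spectrum with a binary comb mask, then sample at pitch kappa.

    s_center holds the M fine samples (pitch delta_x) of an object that fills
    only 1/kappa of a field of view of N = kappa * M fine samples.
    """
    M = s_center.size
    if np.gcd(M, kappa) != 1:
        raise ValueError("M and kappa must be coprime for this toy mask")
    N = kappa * M
    s = np.zeros(N)
    s[:M] = s_center
    mask = np.zeros(N)
    mask[::kappa] = 1.0                        # keep every kappa-th spectral bin
    coded = np.fft.ifft(np.fft.fft(s) * mask)  # image formed behind the mask
    return coded[::kappa]                      # detector samples, pitch kappa * delta_x


def decode(samples, kappa=3):
    """Unfold the aliased spectrum and return the M fine samples of the object."""
    M = samples.size
    folded = np.fft.fft(samples)               # aliased spectrum, M bins
    j = np.arange(M)
    # Bin p of the folded spectrum equals (1/kappa) * S[k] for the single
    # unmasked k among p, p + M, ..., and that k is a multiple of kappa.
    spectrum = kappa * folded[(kappa * j) % M]
    return np.fft.ifft(spectrum).real


rng = np.random.default_rng(0)
obj = rng.random(16)                # 16 fine samples, recovered from 16 coarse pixels
recovered = decode(encode_and_sample(obj))
print(np.allclose(recovered, obj))
```

In the spatial domain this particular mask replicates the object κ times across the field of view, so the coarse detector grid samples each replica at a different sub-pixel offset. Requiring M and κ to be coprime guarantees that these offsets do not coincide.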

The grating of Eq. (1) is illustrated in Fig. 2 in a folded manner: G_{-1}(ν) and G_{1}(ν) are folded into the central third of the spectrum, G_{0}(ν).

As a result, I(ν) can be described as composed of so-called “macro-pixels”. Each macro-pixel consists of the S_{-1}(ν), S_{1}(ν) and S_{0}(ν) contributions [see Fig. 3(a)–Fig. 3(c)]. The structure presented in Fig. 3(a) and 3(b) is the theoretical goal, since it provides a full and simple orthogonality condition. In reality, however, such a binary-like coding grating will have a finite number of harmonics. Therefore, the spectral structure of the “macro-pixels” will be different. However, if properly designed, it will still remain orthogonal (when the proper locations are observed) and will resemble the structure shown in Fig. 3(c).

Next we formulate the reconstruction algorithm for the original image. The orthogonal coding grating mask is a Dammann-like phase structure whose spatial effect is similar to replications. The mask is designed such that different replications are generated for the high-frequency content [G_{-1}(ν) and G_{1}(ν)] and for the low-frequency content [G_{0}(ν)], as seen in Fig. 4. The replications for the high frequencies are 1/6 of the field of view apart, and those for the low frequencies are 1/2 of the field of view apart.

- We shall first reconstruct the high frequency content S_{-1}(ν) and S_{1}(ν) by sampling I(ν). The spatial contents of S_{-1}(ν) and S_{1}(ν) occupy only a fraction of the field of view L_{T}. Therefore it is possible to keep only every 6th (L_{T}/L_{C}) sample without losing information. The other samples are calculated using interpolation. Figure 5(a) illustrates the sampling grid. Note that at the sampling points S_{-1}(ν) and S_{1}(ν) are orthogonal. On the other hand, a certain noise is added to the sampled high frequency content due to S_{0}(ν). In order to minimize this noise effect, each sample value is taken as the algebraic average over its neighborhood. Figure 5(b) shows the Fourier transform of the grating illustrated in Fig. 5(a). As one may note, it resembles 7 delta functions: the two pairs of delta functions appearing on both sides of the central delta resemble a spatial derivative, since each of those two pairs contains one positive and one negative delta with a small spatial shift between them. Those two pairs that make the derivative correspond to the two replications (the -1 and the 1 orders) related to the high frequencies [Fig. 4(a)]. The outer two deltas correspond to the two replications (again the -1 and 1 orders) of the low frequencies [Fig. 4(b)].
- Next, we shall subtract the reconstructed S_{-1}(ν) and S_{1}(ν) from I(ν). This ideally leaves us with only the low frequency content. It is expressed in the spatial domain as:

  $${i}_{L}\left(x\right)=\left({s}_{0}*{g}_{0}\right)\left(x\right)\cdot \mathrm{rect}\left(\frac{x}{{L}_{T}}\right)$$

  where s_{0} and g_{0} are the inverse Fourier transforms of S_{0}(ν) and G_{0}(ν), respectively, and ‘*’ stands for the convolution operation. rect(x/L_{T}) is defined as:

  $$\mathrm{rect}\left(\frac{x}{{L}_{T}}\right)=\begin{cases}1, & \left|x\right|\le \frac{{L}_{T}}{2}\\ 0, & \text{otherwise}\end{cases}$$

  g_{0}(x) in fact consists of three Dirac impulse functions, one for each of the three replications in Fig. 4(b).
- Now we divide each of i_{L}(x) and s_{0}(x) into a set of 6 equally-supported functions, denoted correspondingly as r_{j}(x), j=1,..,6 and f_{j}(x), j=1,..,6. These 2 sets of functions are related through 6 linear equations, which can be well understood after observing Fig. 4(b):

  $$\begin{aligned}{r}_{1}\left(x\right)&={a}_{0}{f}_{1}\left(x\right)+{a}_{-1}{f}_{4}\left(x\right)\\ {r}_{2}\left(x\right)&={a}_{0}{f}_{2}\left(x\right)+{a}_{-1}{f}_{5}\left(x\right)\\ {r}_{3}\left(x\right)&={a}_{0}{f}_{3}\left(x\right)+{a}_{-1}{f}_{6}\left(x\right)\\ {r}_{4}\left(x\right)&={a}_{0}{f}_{4}\left(x\right)+{a}_{1}{f}_{1}\left(x\right)\\ {r}_{5}\left(x\right)&={a}_{0}{f}_{5}\left(x\right)+{a}_{1}{f}_{2}\left(x\right)\\ {r}_{6}\left(x\right)&={a}_{0}{f}_{6}\left(x\right)+{a}_{1}{f}_{3}\left(x\right)\end{aligned}\tag{11}$$

  or alternately through a 6×6 matrix:

  $$\left[\begin{array}{c}{r}_{1}\left(x\right)\\ {r}_{2}\left(x\right)\\ {r}_{3}\left(x\right)\\ {r}_{4}\left(x\right)\\ {r}_{5}\left(x\right)\\ {r}_{6}\left(x\right)\end{array}\right]=\left[\begin{array}{cccccc}{a}_{0}& 0& 0& {a}_{-1}& 0& 0\\ 0& {a}_{0}& 0& 0& {a}_{-1}& 0\\ 0& 0& {a}_{0}& 0& 0& {a}_{-1}\\ {a}_{1}& 0& 0& {a}_{0}& 0& 0\\ 0& {a}_{1}& 0& 0& {a}_{0}& 0\\ 0& 0& {a}_{1}& 0& 0& {a}_{0}\end{array}\right]\left[\begin{array}{c}{f}_{1}\left(x\right)\\ {f}_{2}\left(x\right)\\ {f}_{3}\left(x\right)\\ {f}_{4}\left(x\right)\\ {f}_{5}\left(x\right)\\ {f}_{6}\left(x\right)\end{array}\right]\tag{12}$$

  By inverting the matrix we find f_{j}(x) and therefore s_{0}(x), which is the low frequency content of the original image information. Note that f_{j}(x) are the original 6 spatial regions of s(x) while r_{j}(x) are the spatial distributions obtained in each of the 6 regions after generation of the replications on the CCD plane. Eqs. (11)–(12) correspond to the low frequency shift seen in Fig. 4(b). a_{i} are the coefficients by which each of the 3 replications in Fig. 4(b) is multiplied.
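The last step, inverting the 6×6 relation between r_{j}(x) and f_{j}(x), can be sketched numerically as follows. The replication weights and the region contents below are hypothetical values chosen for illustration, and the function names are not taken from this work.

```python
import numpy as np


def replication_matrix(a_m1, a_0, a_p1):
    """6x6 matrix relating r_j to f_j: regions 1-3 and 4-6 are coupled in pairs."""
    eye3 = np.eye(3)
    return np.block([[a_0 * eye3, a_m1 * eye3],
                     [a_p1 * eye3, a_0 * eye3]])


def recover_regions(r, a_m1, a_0, a_p1):
    """Solve r = A f for f; r has shape (6, n) with one row per spatial region."""
    A = replication_matrix(a_m1, a_0, a_p1)
    # A is singular when a_0**2 == a_m1 * a_p1, since det(A) = (a_0**2 - a_m1 * a_p1)**3.
    return np.linalg.solve(A, r)


# Hypothetical replication weights and region contents, for illustration only.
a_m1, a_0, a_p1 = 0.3, 1.0, 0.4
rng = np.random.default_rng(1)
f_true = rng.random((6, 32))              # six regions, 32 samples each
r = replication_matrix(a_m1, a_0, a_p1) @ f_true
f_est = recover_regions(r, a_m1, a_0, a_p1)
print(np.allclose(f_est, f_true))
```

Because the matrix couples region j only with region j±3, the system is well posed as long as a_{0}^{2} differs from a_{-1}a_{1}; weights close to that condition would amplify noise in the recovered low frequency content.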

## 3. Simulation investigation

In the simulated experiment we assume that the test object is imaged by an optical imaging system whose resolution limit in the periphery equals the pitch of the detector array. In the central part of the field of view the optical resolution is three times greater than in the periphery. In the simulations a Lena image is used as the object. A high frequency 2-D barcode is planted in the center of this image. This barcode is under-sampled if only every third pixel is taken into consideration. Therefore the central high frequency content of the image, that is the barcode pattern, is under-sampled or low-pass filtered by the detector. In the notation of Fig. 1, the resolution of the Lena image is 3δx while the resolution of the barcode pattern is δx. A grating element (the coding mask) was attached to the imaging lens (at the CTF plane, i.e. the entrance pupil of the lens) as depicted in the experimental setup of Fig. 6(a). The grating contained different Dammann gratings (see Ref. [15]) in the central and outer parts of the mask, as described in Fig. 4. The mask itself is illustrated in Fig. 6(b).
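The effect of taking only every third pixel can be illustrated numerically. The sketch below uses a random 1-D signal as a stand-in for the barcode (an assumption made for brevity) and a hypothetical helper `folded_spectrum`. It checks that, in DFT index order, the spectrum of the under-sampled signal is the average of the three spectral thirds of the finely sampled signal, i.e. that the thirds fold onto one another.

```python
import numpy as np


def folded_spectrum(signal, factor=3):
    """Spectrum predicted for the under-sampled signal: average of the shifted thirds."""
    S = np.fft.fft(signal)
    M = signal.size // factor          # signal.size must be a multiple of factor
    return S.reshape(factor, M).sum(axis=0) / factor


rng = np.random.default_rng(2)
fine = rng.random(48)                  # pattern sampled at the fine pitch delta_x
coarse = fine[::3]                     # detector keeps every third sample (pitch 3 * delta_x)
print(np.allclose(np.fft.fft(coarse), folded_spectrum(fine)))
```

Without a coding mask the three contributions in each bin cannot be separated, which is why the barcode is not resolved in the plain under-sampled image.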

The three regions of the grating depicted in Fig. 3(c) are merely shifted cosine functions. In this arrangement the high frequency content is sampled at 1/6 of the basic sampling rate, since the spatial extent of S_{-1}(ν) and S_{1}(ν) is L_{C} = L_{T}/6. The Fourier transform of the grating is merely several impulse functions that in the spatial domain generate the 6 shifted replicas of the object, as shown in Fig. 5(b). After recovering the high frequency content, one should solve a set of 6 linear equations (see Eqs. 11 or 12) in order to reconstruct the low frequency content S_{0}(ν). Figure 7(a) presents the non-zoomed image, in which the full field of view is seen. In this case, though, the central high resolution barcode structure cannot be resolved. In Fig. 7(b) we performed regular optical zooming on the image of Fig. 7(a). Here the field of view is reduced by a factor of 3, but the spatial resolution is improved by the same factor and the central barcode structure can now be resolved.

In the final stage we applied our post processing algorithm to the captured image. The resulting image is shown in Fig. 8. This image proves the concept presented in this paper: the high frequency central field of view (X3 optical zoom) is retrieved along with the non-zoomed remaining field of view. The 6×6 spatial blocks seen in the reconstructed image of Fig. 8 can be removed by proper image processing and enhancement, which was not applied to the obtained image.

The approach was also tested with other input images and produced similar outcomes. It is important to note that in this manuscript we show only the proof of principle for the suggested approach. The noise reduction algorithm and the investigation of this concept under non-coherent and polychromatic illumination are to be presented in future work.

## 4. Conclusions

In this paper we have presented a new approach for obtaining optical zooming in which no moving elements are required and a single lens with a special coding mask is used. The main advantage of the proposed approach is that the zoomed central field of view and the non-zoomed full field of view are obtained simultaneously. This effectively yields a number of spatial pixels that exceeds the number of pixels in the detector array. The described outcome is obtained by attaching a special coding grating to the imaging lens and applying an appropriate digital post processing algorithm.

## References and links

**1.** R. B. Johnson and C. Feng, “Mechanically compensated zoom lenses with a single moving element,” Appl. Opt. **31**, 2274–2280 (1992).

**2.** E. C. Tam, “Smart electro optical zoom lens,” Opt. Lett. **17**, 369–371 (1992).

**3.** H. Tsuchida, N. Aoki, K. Hyakumura, and K. Yamamoto, “Design of zoom lens systems that use gradient-index materials,” Appl. Opt. **31**, 2279–2286 (1992).

**4.** R. J. Pegis and W. G. Peck, “First-order design theory for linearly compensated zoom systems,” J. Opt. Soc. Am. **52**, 905–911 (1962).

**5.** G. Wooters and E. W. Silvertooth, “Optically Compensated Zoom Lens,” J. Opt. Soc. Am. **55**, 347–355 (1965).

**6.** T. ChunKan, “Design of zoom system by the varifocal differential equation. I,” Appl. Opt. **31**, 2265–2273 (1992).

**7.** Y. Ito, “Complicated pin-and-slot mechanism for a zoom lens,” Appl. Opt. **18**, 750–758 (1979).

**8.** D. R. Shafer, “Zoom null lens,” Appl. Opt. **18**, 3863–3870 (1979).

**9.** K. Tanaka, “Paraxial analysis of mechanically compensated zoom lenses. 1: Four-component type,” Appl. Opt. **21**, 2174–2181 (1982).

**10.** D. Y. Zhang, N. Justis, and Y. H. Lo, “Integrated fluidic adaptive zoom lens,” Opt. Lett. **29**, 2855–2857 (2004).

**11.** A. Walther, “Zoom lens and computer algebra,” J. Opt. Soc. Am. A **16**, 198–204 (1999).

**12.** M. N. Akram and M. H. Asghar, “Step-zoom dual-field-of-view infrared telescope,” Appl. Opt. **42**, 2312–2316 (2003).

**13.** A. Walther, “Angle eikonals for a perfect zoom system,” J. Opt. Soc. Am. A **18**, 1968–1971 (2001).

**14.** J. Solomon, Z. Zalevsky, and D. Mendlovic, “Geometrical super resolution by code division multiplexing,” Appl. Opt. **44**, 32–40 (2005).

**15.** H. Dammann and E. Klotz, “Coherent optical generation and inspection of two-dimensional periodic structures,” Opt. Acta **24**, 505–515 (1977).