
Innovative K-Means based machine learning method for determination of non-uniform image coordinate system in panoramic imaging: a case study with Ladybug2 camera

Open Access

Abstract

Currently, the practical implementations of panoramic cameras range from vehicle navigation to space studies, owing in particular to their 360-degree imaging capability. In this variety of uses, it is possible to calculate three-dimensional coordinates from a panoramic image, especially using the Direct Linear Transformation (DLT) method. There are several types of omnidirectional cameras for 360-degree imaging, which can be classified mainly as central and non-central cameras. Central omnidirectional cameras are those that satisfy the single-viewpoint property. Multi-camera systems are usually developed for applications for which two-image stereo vision is not flexible enough to capture the environment surrounding a moving platform. Although the technology based on multi-view geometry is inexpensive, accessible, and highly customizable, multi-camera panoramic imaging systems make it difficult to obtain a single projection center for the cameras. In this study, a method for defining a non-uniform image coordinate system is suggested by means of the K-Means algorithm for a single panoramic image, captured with a Ladybug2 panoramic camera in a panoramic calibration room, together with the definition of an elliptical panoramic projection coordinate system by the Singular Value Decomposition (SVD) method in the panoramic view. The results of the suggested method have been compared with those of the DLT algorithm applied to the same single panoramic image using a conventional photogrammetric image coordinate system.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Capturing a single image of the surrounding environment, a multi-camera system offers great potential for geomatics instrumentation, robotics, car navigation, entertainment systems, and even space applications (such as spacecraft docking navigation systems). In parallel with the increasing importance of 3D information, the significance and use of machine vision studies as a means of obtaining three-dimensional information have also increased.

The first panoramic camera was invented by P. Puchberger of Austria in 1843. It was a hand-crank-driven swing-lens panoramic camera capable of capturing a 150° image. The rotating camera invented by M. Garella of England in 1857 extended the field of view to a full 360° [1]. However, photogrammetry was unable to benefit from these early panoramic cameras, owing to issues with the rotational mechanics during image acquisition and to the complexity and time-consuming nature of image modeling during image correction [2].

The use of panoramic cameras in the field of photogrammetry has increased even further in recent years thanks to continuous advancements in technology. An omnidirectional system with multiple cameras can be designed with a conventional geometric optical system (e.g., perspective lenses) or a non-conventional geometric optical system (e.g., fisheye lenses). Conventional cameras have a narrow FOV, so a larger number of them is needed to cover a full-spherical FOV. Polydioptric systems with fisheye lenses have been the primary choice in the field due to their large FOV [3]. The main contribution of [4] is to express a multi-camera system as a single imaging unit and then to derive the structure-from-motion constraint equations.

Currently, there are many applications built with multiple camera systems. For example, a study conducted by [5] shows that a multi-camera system with six cameras, mainly used for land applications, can be used in underwater work. [6] proposes and evaluates a low-cost and lightweight Personal Mobile Terrestrial System with a versatile camera. [7] registers panoramic images collected with a mobile vehicle to mobile LiDAR data using an automatic mutual-information method; in their study, the panoramic images are obtained from the Ladybug3 camera, which consists of six fisheye lenses. [8] focuses on the problem of computing the extrinsic calibration of a 3D Velodyne LiDAR with respect to a rigidly connected camera while simultaneously estimating its intrinsic parameters; to solve this problem, it is divided into two least-squares sub-problems, each solved analytically to determine an exact initial estimate for the unknown parameters. The purpose of the study conducted by [9] is to develop an absolute visual positioning system for an omni-wheeled robot for indoor navigation. In the field of photogrammetry, [10] investigates the potential of immersive videography with strategically arranged multiple cameras, each facing a specific angle, to create captivating videos. In a study conducted by [11], a conical stereo imaging system with six and twelve sensors is presented, together with a mathematical model of this system. [12] investigates the feasibility of using a six-sensor omnidirectional/fisheye camera system and reports on its performance. In the study of [13], to create a stereo panoramic application, several panoramic images were taken from two distinct locations in the calibration room, and photogrammetric three-dimensional coordinates were calculated for the identified targets. [14] introduces a dataset developed to enable the design of local and global energy-aware navigation and path-planning algorithms for planetary environments; the onboard sensors included an Occam Vision Group omnidirectional stereo camera composed of 10 individual RGB cameras. [15] presents a drone detection system that uses an omnidirectional camera, aiming to improve the performance of small-object detection, which is generally regarded as a challenge for object detection using CNNs. A UAV equipped with a multi-camera imaging system can capture oblique images from almost all angles of view. To highlight the extensive potential of 360° imaging for photogrammetric measurements, some studies have focused on camera calibration and photogrammetric applications of these multi-camera systems based on fisheye lenses [3].

Considering that these systems are composed of multiple cameras, to enable the use of a single set of exterior orientation parameters (EOPs) for the component images and generate composite images, it is necessary to estimate the relative orientation parameters (ROPs). In these recent commercial systems, cameras are arranged in the same compact and closed structure, which hinders direct ROP measurements and thus requires indirect estimation. The calibration of spherical cameras requires that object points are evenly distributed around the camera (to produce suitable geometry) and also that they are accurately measured in the images. Therefore, a proper 360° camera calibration field is required. The extraction of accurate image point coordinates is a critical process in camera calibration [16].

The fundamental image coordinate system in photogrammetry, in which the 2D image coordinates of target signals are measured, adopts the image center as its datum. This datum point, the projection of the perspective center onto the image plane, is called the principal point.

Clustering the image measurements in photogrammetry by means of several clustering methods is possible [17]. In the literature, the term “K-Means” was first used by James MacQueen in 1967 [18]; K-Means clustering is one of the most commonly used unsupervised machine learning algorithms for partitioning a given data set into a set of k groups. K-Means is a non-hierarchical data clustering method that attempts to partition existing data into one or more clusters/groups. This method partitions the data so that data with the same characteristics are grouped into the same cluster and data with different characteristics are grouped into other clusters. The purpose of data clustering is to minimize the objective function set in the clustering process, which generally attempts to minimize variation within a cluster and maximize the variation between clusters. Due to its simplicity of implementation, fast processing, efficiency, versatility, strong scalability, and the ease of interpreting its clustering results, this algorithm has become the most well-known and commonly used algorithm in various research fields [19,20]. A distance formula is used to calculate the distance from each data object to the cluster center points, so as to realize the classification of the dataset. The main disadvantage of the K-Means method is that the number k is often not known a priori [21]. In the literature, there are several studies that have explored the use of eigenvalues to classify LiDAR points; the use of metrics based on eigenvalues together with the K-Means method to carry out the classification is proposed in [22]. R libraries have been shown to be effective in remote sensing data processing tasks, such as classification using K-Means clustering and computing the Normalized Difference Vegetation Index (NDVI) [23]. K-Means generates hyper-elliptical (i.e., elliptical over more than two dimensions), unconstrained clusters, offering the benefit of fast processing and a constrained number of clusters. However, the method requires the number of clusters to be specified beforehand, which limits its usefulness in data mining and often means that the technique produces clusters that meet the “required answer” [24].
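For illustration only, the following minimal Python sketch (using NumPy and scikit-learn, which are not the tools of this study; the MATLAB-based workflow is described in Section 3) partitions synthetic 3D points into k groups and reports the within-cluster sum of squared errors, showing the behaviour described above and the need to fix k beforehand:

```python
# Minimal K-Means illustration on synthetic 3D points (not the study's data).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
points = rng.normal(size=(300, 3))       # synthetic 3D observations

k = 4                                    # k must be chosen beforehand
model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points)

print("cluster labels:", model.labels_[:10])
print("cluster centers:\n", model.cluster_centers_)
print("within-cluster SSE (inertia):", model.inertia_)
```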

By redefining the image coordinate system, the accuracy of the DLT on the panoramic image, and hence of the positions that can be obtained from the DLT, can be improved. In this study, an elliptical panoramic coordinate system is defined in a three-dimensional 360-degree calibration area with the Ladybug2 camera, and a non-uniform image coordinate system is obtained on the panoramic image using the K-Means segmentation method. With the obtained non-uniform coordinate system, the DLT parameters and three-dimensional control points are compared with the DLT results obtained from the conventional image coordinate system. To determine the datum of the image coordinate system in panoramic imaging, a single panoramic image is obtained in the middle of the calibration room and the photogrammetric EOPs are calculated with the DLT algorithm. However, the EOPs calculated with the DLT algorithm from image coordinates measured in the conventional image coordinate system have high residuals for the coordinates of the 3D control points, depending on the panoramic projection lines of the single panoramic image projection. This problem can be solved by defining a non-uniform image coordinate system, calculated through segmentation of the 3D coordinates in the panoramic projection of a single image of the 360-degree calibration room. For this purpose, the K-Means segmentation method has been used to define the non-uniform image coordinate system for the panoramic Ladybug2 camera projection model. Detailed information is presented in the application section. In accordance with the suggested method, the object points to be studied photogrammetrically are obtained from a single panoramic view. Thus, image coordinates measured both in the photogrammetric image coordinate system and in the suggested non-uniform coordinate system are used for the calculation of the control points in the three-dimensional object coordinate system. The residuals, parameters, and accuracies calculated by DLT are compared with each other and reported accordingly. The fundamentals of the suggested process in this study are presented as a flowchart in Fig. 1.

Fig. 1. Flowchart of the study.

In the second section of this paper, information about the geometric model of the projection of panoramic images and the associated coordinate systems is given. The third section focuses on the application of the research; it describes the K-Means application and the definition of a non-uniform image coordinate system with camera parameters according to the panoramic projection model of the Ladybug2 panoramic camera. The fourth section describes the results of the experiment and includes a discussion of their analysis. The last section is the conclusion, which provides concluding remarks on the study.

2. Coordinate systems in panoramic imaging

Panoramic modeling is designed to provide users with full 360-degree landscape viewing and observation. A spherical panoramic structure is a variation of the sphere model; in this model, the observation point is located in the center of the sphere [25]. The coordinate systems used in panoramic cameras go beyond the Cartesian object coordinate system and the image coordinate system. It is also necessary to define a supporting coordinate system (ellipsoidal, cylindrical, spherical, conical, etc.) that is set up to define the specific camera projection geometry. Figure 2 shows the general transformation steps, including a panoramic coordinate system.

Fig. 2. The coordinate systems.

Figure 3 presents the transformation geometry, including an image coordinate system, between these auxiliary coordinate systems for this study.

Fig. 3. Structure of coordinate systems in panoramic imaging.

The coordinates of the panoramic projection system are calculated from the control points whose coordinates are known in the 3D calibration room, which was set up in 2009 [26]. Let the panoramic image width be a and the radius of the spherical model be R; after the image is mapped, the relationship between a and R should be a = 2πR, so the aspect ratio of the panoramic image should be 2:1. Let the pixel coordinates of the panoramic image be p'(u, v) and the corresponding 3D point coordinates on the panoramic model be P'(X, Y, Z); the mapping relationship follows the literature [27]. The correspondence between the two-dimensional point p' and the three-dimensional point P' on the spherical surface can be obtained from the image coordinates by the derivation of the pixel coordinates. In this case, conversion between the two systems is provided by Eqs. (1), (2), and (3) [28].

$$\begin{aligned} X &= \frac{1}{2}R\sqrt {{\mu ^2} + {v^2} - {\mu ^2}{v^2} - 1} \cos \varPhi \qquad - \pi \le \varPhi \le \pi \\ Y &= \frac{1}{2}R\sqrt {{\mu ^2} + {v^2} - {\mu ^2}{v^2} - 1} \sin \varPhi \qquad 1 \le \mu \\ Z &={-} \frac{1}{2}R\mu v\qquad - 1 \le v \le 1,\quad R \ge 0\;\textrm{and constant} \end{aligned}$$
where the angle $\varPhi $ is expressed in radians.
$$\begin{aligned} r_1^2 = {x^2} + {y^2} + {\left( {z + \frac{R}{2}} \right)^2}\\ r_2^2 = {x^2} + {y^2} + {\left( {z - \frac{R}{2}} \right)^2} \end{aligned}$$
$$\mu = \frac{{{r_1} + {r_2}}}{2}\qquad \quad v = \frac{{{r_1} - {r_2}}}{2}$$
where ${r_1}$ and ${r_2}$ are the elliptic radii.
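As a minimal sketch of Eqs. (1)–(3), the following Python functions (NumPy only; symbol names follow the text, and the code is illustrative rather than the study's MATLAB implementation) map elliptic parameters to 3D model coordinates and recover (μ, v) from the elliptic radii:

```python
# Sketch of the elliptic mapping of Eqs. (1)-(3). R is the model radius,
# (mu, v, phi) are the elliptic parameters, (r1, r2) are the elliptic radii.
import numpy as np

def elliptic_to_cartesian(mu, v, phi, R):
    """Eq. (1): map elliptic parameters to 3D model coordinates."""
    s = np.sqrt(mu**2 + v**2 - mu**2 * v**2 - 1.0)
    X = 0.5 * R * s * np.cos(phi)
    Y = 0.5 * R * s * np.sin(phi)
    Z = -0.5 * R * mu * v
    return X, Y, Z

def cartesian_to_elliptic(x, y, z, R):
    """Eqs. (2)-(3): recover (mu, v) from 3D coordinates via the elliptic radii."""
    r1 = np.sqrt(x**2 + y**2 + (z + R / 2.0) ** 2)
    r2 = np.sqrt(x**2 + y**2 + (z - R / 2.0) ** 2)
    mu = (r1 + r2) / 2.0   # note: some formulations normalise r1 +/- r2 by R instead of 2
    v = (r1 - r2) / 2.0
    return mu, v
```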

3. Application

In this application, the Ladybug2 360-degree panoramic camera, which has a resolution of 2048 × 1024 pixels with a 0.00465 mm pixel size, has been used for image acquisition. A simple interface has been developed in MATLAB software for this study. The interface and data acquisition process are shown in Fig. 4.

Fig. 4. MATLAB interface and panoramic data acquisition in the study.

The Ladybug2 panoramic camera, which consists of six identical cameras, has 1/3-inch Sony ICX204AK sensors and fisheye lenses. The conventional panoramic coordinate system mainly provides a cylindrical projection. In the Ladybug2 camera used in this study, there is only one camera in the zenith direction. In a panoramic image with a resolution of 2048 × 1024 pixels, it is necessary to define an image coordinate system that can be segmented non-uniformly in both the X and Y directions. Accordingly, when creating a panoramic coordinate system that can be segmented in both the x and y directions, the preferred mathematical model is the elliptical system. If the same segmentation were conducted in the conventional (cylindrical) system, the parameters on the y-axis of the cylindrical projection would not change: the segmentation along the 1024-pixel y-axis of the panoramic image would not be non-uniform and would undergo a scaling process resulting in shrinkage. In the elliptical system, the short image axis y (1024 pixels), covered by the single zenith camera, can be modeled accurately as a non-uniform axis. From a mathematical point of view, the elliptical system is expected to yield a parametrically more precise non-uniform coordinate system than the single radius value of the conventional (cylindrical) system, because the elliptical system has two radius values. Likewise, Ref. [29] is also considered for the fisheye projection.

Three-dimensional elliptical coordinates have been calculated from the image coordinates of the control points using Eq. (1) in the 360-degree calibration room. The paraboloid surface was calculated through the Singular Value Decomposition (SVD) method, and the corresponding reference paraboloid projection function for the image plane was derived. The adoption of the SVD method stemmed from its suitability for solving symmetric matrix problems. In linear algebra, the SVD of a matrix is a factorization of that matrix into three matrices. It has well-known algebraic properties and conveys important geometrical and theoretical insights about linear transformations, as given by Eq. (4) [30].

$$\begin{aligned} A &= U\ast S\ast {V^t} = \mathop \sum \limits_{i = 1}^n ({{\sigma_i}\ast {u_i}\ast v_i^t} )\\ S &= \left( {\begin{array}{ccc} {{\sigma_1}}& \cdots &0\\ \vdots & \ddots & \vdots \\ 0& \cdots &{{\sigma_n}} \end{array}} \right) \end{aligned}$$
where ${\sigma _i}$ are the singular values and the rank of the matrix A is the number of non-zero singular values; U is an orthogonal matrix of order m × m, V is an orthogonal matrix of order n × n, and S is a “pseudo-diagonal” matrix of rank r.
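A compact NumPy sketch of the factorization in Eq. (4) is given below; the matrix A here is an arbitrary placeholder, not the projection data of this study:

```python
# SVD factorisation of Eq. (4): A = U * S * V^T; the rank is the number of
# non-zero singular values. The matrix below is a placeholder example.
import numpy as np

A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])          # m x n observation/design matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)
rank = np.sum(s > 1e-12)                  # count of non-zero singular values

A_rebuilt = (U * s) @ Vt                  # sum of sigma_i * u_i * v_i^T
print("singular values:", s, "rank:", rank)
print("max reconstruction error:", np.max(np.abs(A - A_rebuilt)))
```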

One of the reasons for choosing the SVD method here is that it is used in symmetric matrix solutions [31]. The panoramic camera used is expected to have a symmetrical structure because of the panoramic projection geometry on the plane for 360-degree imaging. In the literature, the SVD method has also been employed to adjust data sets and project them orthogonally onto a plane [32]. The SVD matrix structure is given in Fig. 5.

Fig. 5. Matrix structure of SVD algorithm [33].

Thus, residuals in the spherical coordinate system caused by the EOPs of the camera were also normalized by SVD [31]. The SVD results of the projection surface are given in Table 1.

Table 1. SVD results of elliptic coordinate system

After the panoramic projection for the adjusted paraboloid surface was defined from the SVD method, the hyperbolic paraboloid projection matrix was determined for the perspective projection of the panoramic coordinates onto the image plane by Eq. (5).

$$\frac{{\boldsymbol z}}{{\boldsymbol c}} = \frac{{{{\boldsymbol x}^2}}}{{{{\boldsymbol a}^2}}} - \frac{{{{\boldsymbol y}^2}}}{{{{\boldsymbol b}^2}}}$$
where $a,\; b,\; \textrm{and}\; c$ are the hyperbolic paraboloid coefficients.
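Because Eq. (5) can be rearranged as z = (c/a²)x² − (c/b²)y², the two combined coefficients p = c/a² and q = c/b² can be estimated by ordinary least squares. The sketch below illustrates this with synthetic points; it is an illustrative formulation, not the fitting code used in the study:

```python
# Least-squares fit of the hyperbolic paraboloid of Eq. (5),
# rewritten as z = p*x**2 - q*y**2 with p = c/a**2 and q = c/b**2.
import numpy as np

def fit_hyperbolic_paraboloid(x, y, z):
    """Return (p, q) minimising || z - (p*x^2 - q*y^2) ||."""
    A = np.column_stack([x**2, -(y**2)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs  # p, q

# Synthetic check with known coefficients.
rng = np.random.default_rng(1)
x, y = rng.uniform(-1, 1, 200), rng.uniform(-1, 1, 200)
z = 0.8 * x**2 - 0.5 * y**2 + rng.normal(scale=0.01, size=200)
p, q = fit_hyperbolic_paraboloid(x, y, z)
print("estimated p, q:", p, q)   # close to 0.8 and 0.5
```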

In the literature, there have been similar studies concerning the paraboloid function generated by elliptical coordinates, especially in the context of fisheye lenses [29]. For the Ladybug2 panoramic camera model, the elliptical coordinate system and the hyperbolic paraboloid camera projection matrix, created using image coordinate data from 92 control points in the 360-degree calibration room, have been compared with the corresponding 3D object coordinates of the control points. This comparison was made with a 3D affine transformation, resulting in the determination of the projection residuals. Notably, the identified residuals were reduced compared with the standard cylindrical projection used in the panoramic coordinate system. After the SVD adjustment, the 3D projection coordinates were calculated more accurately for the 92 control points in the panoramic calibration room. In Fig. 6, (a) illustrates the 3D hyperbolic paraboloid projection matrix, (b) illustrates the projection matrix in 2D, and (c) shows the 3D affine transformation residuals of the control points in the calibration room.

Fig. 6. Definition of elliptic panoramic projection.

After this process, the panoramic camera projection matrix has been detailed along each 3D coordinate axis. The projection (camera) matrix shows that 11 different clusters occur when the projection of a single panoramic view of the Ladybug2 panoramic camera is carried out. In Fig. 7, (a) illustrates the ellipsoidal coordinates of the 92 control points in the XZ direction, (b) illustrates the same points in 3D, and (c) illustrates the Z-differences matrix of the projection obtained via linear interpolation.

Fig. 7. Z differences matrix for the projection system.

Thus, the machine-learning K-Means segmentation is carried out using the elliptical coordinate system defined above. The K-Means approach is a centroid-based partitional clustering method, where the centroids are the arithmetically calculated centers of the clusters and k is the number of clusters [34]. In the calibration room, the projection matrix has 11 different Z (depth) values depending on the distance from the projection center. Hence, the cluster number for the K-Means segmentation was set to 11 before the training process. The K-Means algorithm is composed of the following steps:

  • 1. Initially, select k centroids/cluster centers. It is preferable to locate them near the data but not at the same point as each other.
  • 2. Then, allocate each data point to the nearest centroid.
  • 3. Move the centroids to the average position of the data points allocated to them.
  • 4. Repeat the preceding two steps until the allocations don’t change.
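A minimal from-scratch sketch of these four steps is shown below (Lloyd's algorithm with Euclidean distance; the input array stands for the elliptic 3D coordinates of the control points, and k = 11 follows the cluster count stated above — this is an illustration, not the study's MATLAB code):

```python
# From-scratch K-Means following steps 1-4 above (Euclidean distance).
import numpy as np

def kmeans(points, k=11, max_iter=1000, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: pick k initial centroids from the data points themselves.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iter):
        # Step 2: allocate each point to its nearest centroid.
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Step 3: move each centroid to the mean of its allocated points.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)])
        # Step 4: stop when the centroids (and hence allocations) no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```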

In this study, the K-Means segmentation matrix is calculated using the Euclidean distance, with the SVD-adjusted elliptic three-dimensional coordinates of the control points as input and one thousand iterations. Then, by calculating the average value of each cluster, new cluster centers are determined and the object-to-center distances are examined again. The total squared error criterion, SSE (Summed Squared Error), is most commonly used to evaluate K-Means clustering; the clustering result with the lowest SSE value gives the best result. The sum of the squares of the distances of the objects from the central points of their respective clusters is calculated by Eq. (6) and Eq. (7) [35,36].

$$\begin{aligned} SSE = \mathop \sum \limits_{i = 1}^K \mathop \sum \limits_{x \in {C_i}} dis{t^2}({{x_i},c} )\\ d({x,c} )= 1 - \frac{{({x - \overrightarrow {\bar{x}} } ){{({c - \overrightarrow {\bar{c}} } )}^{\prime}}}}{{\sqrt {({x - \overrightarrow {\bar{x}} } ){{({x - \overrightarrow {\bar{x}} } )}^{\prime}}} \sqrt {({c - \overrightarrow {\bar{c}} } ){{({c - \overrightarrow {\bar{c}} } )}^{\prime}}} }} \end{aligned}$$
where;
$$\begin{aligned} \overrightarrow {\bar{x}} = \frac{1}{p}\left( {\mathop \sum \limits_{j = 1}^p {x_j}\; } \right)\overrightarrow {{1_p}} \\ \overrightarrow {\bar{c}} = \frac{1}{p}\left( {\mathop \sum \limits_{j = 1}^p {c_j}\; } \right)\overrightarrow {{1_p}} \end{aligned}$$
where x is an observation (that is, a row of X), c is a centroid (a row vector), p is the space dimension, and $\overrightarrow {{1_p}} $ is a row vector of p ones.
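The two quantities in Eqs. (6) and (7) can be evaluated directly; the short sketch below assumes labels and centroids such as those returned by the K-Means routine sketched above, and is illustrative only:

```python
# SSE of Eq. (6) and the correlation distance d(x, c) of Eqs. (6)-(7).
import numpy as np

def sse(points, labels, centroids):
    """Summed squared Euclidean error over all clusters."""
    diffs = points - centroids[labels]
    return np.sum(diffs**2)

def correlation_distance(x, c):
    """d(x, c) = 1 - Pearson correlation between observation x and centroid c."""
    xc = x - x.mean()     # x - x_bar, Eq. (7)
    cc = c - c.mean()     # c - c_bar, Eq. (7)
    return 1.0 - (xc @ cc) / (np.sqrt(xc @ xc) * np.sqrt(cc @ cc))
```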

The K-Means algorithm is based on the main idea that a central point represents each cluster [37]. According to the working mechanism of the K-Means algorithm, k objects are selected, each of which represents the center or mean of a cluster. The remaining objects are assigned to the clusters to which they are most similar, taking into account their distance from the mean values of the clusters, as shown in Fig. 8.

Fig. 8. K-Means segmentation: (a) illustrates clusters for 92 control points via K-Means segmentation, (b) illustrates K-Means cluster centers of the image plane.

The K-Means segmentation matrix and the depth (Z) values are interpolated linearly together. Thus, grid points are determined on the image for the non-uniform image coordinate system. Non-uniform grids are created on the image with the lines passing through these grid points, as shown in Fig. 9.
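One possible way to realise such a linear interpolation of depth values over the image plane is scipy.interpolate.griddata; in the sketch below the arrays cluster_uv and cluster_depth are synthetic placeholders (not the study's cluster centres), and the grid spacing is arbitrary:

```python
# Linear interpolation of cluster-centre depth (Z) values over the image plane
# to obtain grid nodes. All data below are synthetic and illustrative only.
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(2)
cluster_uv = rng.uniform([0, 0], [2048, 1024], size=(11, 2))   # image positions of 11 centres
cluster_depth = rng.uniform(1.0, 5.0, size=11)                 # one depth (Z) value per cluster

# Regular sampling positions over the 2048 x 1024 panoramic image.
grid_u, grid_v = np.meshgrid(np.linspace(0, 2048, 33), np.linspace(0, 1024, 17))

# Linear interpolation of depth over the image plane (NaN outside the convex hull).
grid_depth = griddata(cluster_uv, cluster_depth, (grid_u, grid_v), method="linear")
```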

Fig. 9. K-Means segmentation: (a) illustrates non-uniform grid nodes from K-Means segmentation, (b) illustrates non-uniform grid lines for K-Means segmentation.

At the end of the application, the non-uniform coordinate system has been transformed onto the panoramic image, as shown in Fig. 10.

Fig. 10. Non-uniform image coordinate system calculated via K-Means method.

4. Results and discussion

In this study, a K-Means clustering-based framework is developed for the determination of a non-uniform image coordinate system. When the non-uniform image coordinate grid is determined with the suggested K-Means method, the camera matrix is expanded with the K-Means grids and the projection matrix of the proposed K-Means method is obtained. The coordinates of the control points are then calculated with the help of a three-dimensional affine transformation for the resulting K-Means projection matrix. The resulting projection matrix and the newly calculated three-dimensional coordinates of the control points are shown in Fig. 11.

Fig. 11. Definition of K-Means derived elliptic panoramic projection: (a) illustrates projection matrix via K-Means derived in 3D, (b) illustrates projection matrix via K-Means derived in 2D, (c) is 3D affine transformation residuals of control points calculated from K-Means derived system in the calibration room.

In this study, the coordinate differences between the three-dimensional control point coordinates calculated with the proposed K-Means projection matrix and the three-dimensional coordinates in the calibration room were investigated by a 3D affine transformation employing Eq. (8).

$$\begin{aligned} X^{\prime} &= {a_0} + {a_1}X + {a_2}Y + {a_3}Z\\ Y^{\prime} &= {b_0} + {b_1}X + {b_2}Y + {b_3}Z\\ Z^{\prime} &= {c_0} + {c_1}X + {c_2}Y + {c_3}Z \end{aligned}$$
where $(X^{\prime}, Y^{\prime}, Z^{\prime})$ are the transformed coordinates, ${a_0}$, ${b_0}$, and ${c_0}$ are the three translation parameters, and ${a_1},{a_2},{a_3},{b_1},{b_2},{b_3},{c_1},{c_2},\;\textrm{and}\;{c_3}$ are the coefficients of the transformation matrix. According to the results, the RMSE values of the 3D affine transformation of the coordinates obtained with the proposed K-Means projection matrix are given in Table 2; relative to the residuals against the object coordinates in the calibration room, they represent an 88.40% improvement over the coordinates obtained with the conventional projection camera matrix. The graphs of the object coordinate residuals of the 3D affine transformation, calculated along the three axes (a, b, c), are also given in Fig. 12.
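For reference, the 12-parameter transformation of Eq. (8) can be estimated by linear least squares; the sketch below is a generic formulation (the arrays src and dst stand for the projected and reference control-point coordinates and are placeholders, not the study's data):

```python
# 12-parameter 3D affine transformation of Eq. (8) estimated by least squares,
# with the per-axis RMSE of the residuals.
import numpy as np

def fit_affine_3d(src, dst):
    """Estimate X' = a0 + a1*X + a2*Y + a3*Z (and likewise for Y', Z')."""
    n = len(src)
    A = np.hstack([np.ones((n, 1)), src])             # design matrix [1, X, Y, Z]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)   # (4, 3) parameter matrix
    residuals = dst - A @ params
    rmse = np.sqrt(np.mean(residuals**2, axis=0))      # RMSE per coordinate axis
    return params, residuals, rmse
```

Each column of the 4 × 3 parameter matrix then holds (a0, a1, a2, a3), (b0, b1, b2, b3), and (c0, c1, c2, c3), respectively.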

Fig. 12. Residuals of 3D control points in the calibration room.

Table 2. RMSE of 3D affine transformation parameter

In the second step of the discussion, the image coordinates of the 92 control points were transformed from the conventional photogrammetric image coordinate system to the suggested system by a 2D affine transformation, in order to eliminate operator-dependent image coordinate measurement errors for the same control point targets. Then, the DLT results of these two image coordinate systems were compared with each other. Based on the results of the suggested method, the residuals of the coordinates obtained with the proposed K-Means projection matrix are indicated in Table 3; they represent an 84.6% improvement over the coordinates obtained with the conventional projection camera matrix when compared with the actual coordinates in the calibration room.
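For context, the eleven DLT coefficients used in such a comparison can be obtained linearly from 3D control points and their 2D image coordinates; the sketch below shows the standard textbook formulation and is not claimed to be the study's implementation:

```python
# Standard 11-parameter DLT solved linearly from 3D control points (obj_xyz)
# and their 2D image coordinates (img_uv). Generic formulation, for illustration.
import numpy as np

def dlt_parameters(obj_xyz, img_uv):
    """Solve u = (L1 X + L2 Y + L3 Z + L4)/(L9 X + L10 Y + L11 Z + 1), likewise for v."""
    rows, rhs = [], []
    for (X, Y, Z), (u, v) in zip(obj_xyz, img_uv):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        rhs.append(u)
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        rhs.append(v)
    A = np.asarray(rows, dtype=float)
    b = np.asarray(rhs, dtype=float)
    L, *_ = np.linalg.lstsq(A, b, rcond=None)   # L1 ... L11
    return L
```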

Table 3. DLT Parameters of K-Means Segmented Non-Uniform Grid

Figure 13 shows that the principal point obtained by DLT from the coordinate system conforms to the geometry of the camera matrix because of the scale of the L values. The image coordinate residual graph is also given in Fig. 14.

Fig. 13. DLT residuals for conventional and K-Means segmented image coordinate system.

Fig. 14. Image coordinate residuals for control points via DLT.

In this comparison, the residuals of the 3D object coordinates determined with DLT from the suggested non-uniform image coordinate system are considerably more accurate than those of the conventional image coordinate system. Not only has the projection center of the suggested system reached better precision, but the suggested system also provides the DLT parameters from a single panoramic image. The RMSE values of the DLT parameters are given in Table 4, and the corresponding graph is shown in Fig. 15.

Fig. 15. RMSE graphic of DLT parameters.

Table 4. RMSE of DLT parameters

Based on the results of the suggested method, the interior orientation parameters (IOPs) and exterior orientation parameters (EOPs) were also derived from the Direct Linear Transformation (DLT) for the non-uniform image grid. The IOPs are shown in Table 5.
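For reference, the sketch below applies the commonly used closed-form relations that recover the principal point and the projection centre from the eleven DLT coefficients; this is a generic formulation, not the exact derivation used in the study:

```python
# Principal point (cx, cy) and projection centre recovered from the eleven DLT
# coefficients using the usual closed-form relations (generic formulation).
import numpy as np

def dlt_iop_eop(L):
    L1, L2, L3, L4, L5, L6, L7, L8, L9, L10, L11 = L
    D2 = 1.0 / (L9**2 + L10**2 + L11**2)
    cx = D2 * (L1 * L9 + L2 * L10 + L3 * L11)      # principal point x
    cy = D2 * (L5 * L9 + L6 * L10 + L7 * L11)      # principal point y
    # Projection centre: solution of M [X0, Y0, Z0]^T = -[L4, L8, 1]^T.
    M = np.array([[L1, L2, L3], [L5, L6, L7], [L9, L10, L11]])
    X0 = np.linalg.solve(M, -np.array([L4, L8, 1.0]))
    return (cx, cy), X0
```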

Table 5. IOP values via DLT

As a result of the DLT, a difference of 0.4486 is observed between the conventional method and the proposed method in the L parameter. Accordingly, differences at the meter level are seen in the X, Y, and Z directions of the three-dimensional coordinates of the camera projection center in Table 6.

Table 6. EOP values via DLT

At the same time, when the cx and cy values obtained from the proposed K-Means projection are compared with those obtained from the conventional coordinate system, a 109% improvement is evident. When the accuracy of the K-Means framework recommended for the Ladybug2 camera used in the application is examined, all cluster centers are detected correctly, while only cluster center number 2 (CP2) has an error rate of 1.1%. This causes the residuals in that area of the camera to be calculated as higher, which is interpreted as an issue related to the manufacture of the camera. To the best of the author's knowledge, this paper is the first report on the use of the K-Means clustering algorithm to define a non-uniform image coordinate system for panoramic cameras.

5. Conclusion

The three-dimensional coordinates of the signalized targets were calculated using geodetic surveying techniques in the calibration room. The coordinates estimated with conventional surveying techniques are assumed to be the true coordinates and were produced at ±0.04 mm accuracy at real scale when the calibration room was set up [13]. The control points are assumed to be geometrically error-free, so the estimated errors of the adjustment do not depend on the weight matrix of the target points. The camera matrix is expanded with the K-Means grids, and the projection matrix is provided for the proposed K-Means method. This means that, during the determination of the K-Means-suggested image coordinate system in a panoramic image, the DLT parameters can be used not only as EOPs but also as IOPs for calculation from a single panoramic view. In the DLT analysis utilizing the image coordinate system obtained through the proposed method, it was observed that the cx and cy values approached zero. This finding indicates that the hyperbolic paraboloid projection matrix determined in the study expresses the projection of the Ladybug2 camera quite well for the elliptic coordinate system.

This study shows that it is possible to use the elliptic Cartesian coordinate system in panoramic cameras as well as in fisheye cameras. Remaining issues might be addressed through the K-Means setup, which here is based on the Pearson correlation distance (for instance by increasing the number of K-Means iterations), so that the effect of errors in the scale differences of the K-Means centers can be reduced. This study used one thousand iterations to determine the K-Means segmentation matrix. As a result of the study, the general advantages and disadvantages of the presented approach for data acquisition are briefly given below.

Advantages:

  • • Accuracy for the control points in a three-dimensional object coordinate system
  • • Convenient IOP and EOP calculation potential with the DLT Method
  • • Accuracy for the DLT parameters and its residuals

Disadvantages:

  • • The experimentation is limited to the ellipsoidal panoramic coordinate system, and its applicability to other coordinate systems remains unknown.
  • • A 360-degree three-dimensional panoramic calibration area is required
  • • It has been exclusively tested on the Ladybug2 camera, and its efficacy on various other panoramic cameras is yet to be investigated.

In conclusion, in the process of determining the suggested image coordinate system for panoramic cameras, the DLT parameters can be used for a single panoramic view. They are used directly in the orientation process without a conventional camera file. In the future, software used for panoramic images might be developed with this approach. To achieve this, the suggested non-uniform coordinate system first needs to be integrated into the digital camera. The digital camera can then be applied to various applications such as space studies on navigation and close-range photogrammetric image processing. The suggested method can also be tested with different types of cameras, such as catadioptric cameras, for the definition of a non-uniform image coordinate system in a single panoramic view. This technical approach can also be beneficial in close-range photogrammetric applications for architectural and cultural heritage documentation.

In the future, this approach can be integrated into any panoramic sensor technology used in survey imaging, including photogrammetry and machine vision. Moreover, the non-uniform image coordinate systems produced on the panoramic image by different machine learning segmentation methods (C-Means, SVM, etc.), within the elliptic coordinate system created by the panoramic techniques, will be investigated.

Disclosures

The author declares no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the author upon reasonable request.

References

1. O. Faugeras, Panoramic Vision: Sensors, Theory, and Applications (Springer Science & Business Media, 2001).

2. T. Luhmann, “A historical review on panorama photogrammetry,” in Proceedings of International Society for Photogrammetry and Remote Sensing (2004).

3. L. F. Castanheiro, “Geometric model of a dual-fisheye system composed of hyper-hemispherical lenses,” Master’s Thesis, School of Sciences and Technology of São Paulo State University, Brazil (2020).

4. R. Pless, “Using many cameras as one,” in Computer Society Conference on Computer Vision and Pattern Recognition (IEEE,2003).

5. J. Bosch, N. Gracias, P. Ridao, et al., “Omnidirectional underwater camera design and calibration,” Sensors 15(3), 6033–6065 (2015). [CrossRef]  

6. M. B. Campos, A. M. G. Tommaselli, E. Honkavaara, et al., “A backpack-mounted omnidirectional camera with off-the-shelf navigation sensors for mobile terrestrial mapping: Development and forest application,” Sensors 18(3), 827 (2018). [CrossRef]  

7. R. Wang, F. P. Ferrie, and J. Macfarlane, “Automatic registration of mobile LiDAR and spherical panoramas,” in Computer Society Conference on Computer Vision and Pattern Recognition Workshops (IEEE, 2012), pp. 33–40.

8. F. M. Mirzaei, D. G. Kottas, S. I. Roumeliotis, et al., eds. (2016), pp. 183–200.

9. A. S. Kundu, O. Mazumder, A. Dhar, et al., “Scanning camera and augmented reality based localization of omnidirectional robot for indoor application,” Procedia. Comput. Sci. 105, 27–33 (2017). [CrossRef]  

10. K. Kwiatek and R. Tokarczyk, “Photogrammetric applications of immersive video cameras,” ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. II-5, 211–218 (2014). [CrossRef]  

11. P. Firoozfam, “Multi-camera imaging for 3D mapping and positioning: Stereo and panoramic conical views,” PhD. Thesis, University of Miami, USA (2004).

12. A. Ladai, C. Toth, and Z. Tóth, “Indoor Mapping with AN Omnidirectional Camera System: Performance Analysis,” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. XLIII-B1-2022, 347–352 (2022). [CrossRef]  

13. C. Sahin and B. Ergun, “Indoor stereo photogrammetry via omnidirectional multicamera system case study: Ladybug2,” in Physical and Chemical Sensors: Design, Applications & Networks, S. Y. Yurish, eds. (2019), pp. 197–224.

14. O. Lamarre, O. Limoyo, F. Marić, et al., “The Canadian planetary emulation terrain energy-aware rover navigation dataset,” Int. J. Robot. Res. 39(6), 641–650 (2020). [CrossRef]  

15. M. Hirabayashi, K. Kurosawa, R. Yokota, et al., “Flying object detection system using an omnidirectional camera,” Forensic Sci. Int. 35, 301027 (2020). [CrossRef]  

16. M. B. Campos, A. M. G. Tommaselli, J. Marcato Junior, et al., “Geometric model and assessment of a dual-fisheye imaging system,” Photogramm. Rec. 33(162), 243–263 (2018). [CrossRef]  

17. B. Ergun, T. Kavzoglu, I. Colkesen, et al., “Data filtering with support vector machines in geometric camera calibration,” Opt. Express 18(3), 1927–1936 (2010). [CrossRef]  

18. J. B. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of Fifth Berkeley Symposium on Mathematical Statistics and Probability (1967), pp. 281–297.

19. A. M. Ikotun, A. E. Ezugwu, L. Abualigah, et al., “K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data,” Inf. Sci. 622, 178–210 (2023). [CrossRef]  

20. A. K. Jain, “Data clustering: 50 years beyond K-means,” Pattern Recognit. Lett. 31(8), 651–666 (2010). [CrossRef]  

21. M. D. S. Lubis, H. Mawengkang, and S. Suwilo, “Performance analysis of entropy methods on K means in clustering process,” J. Phys.: Conf. Ser. 930(1), 012028 (2017). [CrossRef]  

22. R. C. D. Santos, M. Galo, and V. M. Tachibana, “Classification of LiDAR data over building roofs using k-means and principal component analysis,” Bol. Ciênc. Geod. 24(1), 69–84 (2018). [CrossRef]  

23. P. Lemenkova and O. Debeir, “R Libraries for remote sensing data classification by k-means clustering and NDVI computation in Congo River Basin, DRC,” Appl. Sci. 12(24), 12554 (2022). [CrossRef]  

24. R. Hyde, R. Hossaini, and A. A. Leeson, “Cluster-based analysis of multi-model climate ensembles,” Geosci. Model Dev. 11(6), 2033–2048 (2018). [CrossRef]  

25. S. Liu, L. Zhao, J. Li, et al., “Multi-resolution panoramic modeling based on spherical projective geometry,” in Proceedings of 2nd International Conference on Computer Science and Network Technology (2012), pp. 2171–2174.

26. B. Ergun, S. Kulur, A. Alkis, et al., “Three dimensional calibration room design and application for architectural documentation methods,” in Proceedings of 22nd CIPA Symposium (2009).

27. D. Schneider and E. Schwalbe, “Design and testing of mathematical models for a full-spherical camera on the basis of a rotating linear array sensor and a fisheye lens,” in Proceedings of 7th Conference on Optical 3-D Measurement Techniques (2005), pp. 245–254.

28. W. Schweizer, Special Functions in Physics with MATLAB (Springer Cham, 2021).

29. H. Zhu, X. Wang, and C. Yi, “An elliptical function model for fisheye camera correction,” in Proceedings of 9th World Congress on Intelligent Control and Automation (2011), pp. 248–253.

30. T. Satogata, “SVD Orbit Correction for ALPHA,” (2014), http://toddsatogata.net/Papers/TN-14-030.pdf.

31. V. Guruswami and R. Kannan, Computer Science Theory for the Information Age, Carnegie Mellon University (2018), https://www.cs.cmu.edu/~venkatg/teaching/CStheory-infoage/hopcroft-kannan-feb2012.pdf.

32. F. Wang, A. Louys, N. Piasco, et al., “PlaNeRF: SVD Unsupervised 3D Plane Regularization for NeRF Large-Scale Scene Reconstruction,” arXiv, arXiv:2305.16914v3 (2023). [CrossRef]  

33. J. M. Phillips, Data Mining: Algorithms, Geometry, and Probability, University of Utah, https://users.cs.utah.edu/~jeffp/DMBook/L14-SVD.pdf.

34. P. N. Tan, M. Steinbach, A. Karpatne, et al., Introduction to Data Mining, 2nd ed. (Pearson Education, 2019).

35. A. K. Jain, M. N. Murty, and P. J. Flynn, “Data clustering: a review,” ACM Comput. Surv. 31(3), 264–323 (1999). [CrossRef]  

36. L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis (John Wiley & Sons, 1990).

37. U. M. Fayyad, G. Piatetsky-Shapiro, R. Smyth, et al., Advances in Knowledge Discovery and Data Mining (AAAI/MIT Press, 1996).
