Abstract
Programmable linear optical interferometers are important for classical and quantum information technologies, as well as for building hardware-accelerated artificial neural networks. Recent results showed the possibility of constructing optical interferometers that could implement arbitrary transformations of input fields even in the case of high manufacturing errors. The building of detailed models of such devices drastically increases the efficiency of their practical use. The integral design of interferometers complicates its reconstruction since the internal elements are hard to address. This problem can be approached by using optimization algorithms [Opt. Express 29, 38429 (2021) [CrossRef] ]. In this paper, we present what we believe to be a novel efficient algorithm based on linear algebra only, which does not use computationally expensive optimization procedures. We show that this approach makes it possible to perform fast and accurate characterization of high-dimensional programmable integrated interferometers. Moreover, the method provides access to the physical characteristics of individual interferometer layers.
© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
1. Introduction
The use of classical and quantum properties of light opens up new possibilities for information processing methods [1]. Programmable optical circuits play an important role in classical and quantum optical communications [2–5], in quantum computing [6–8], and in machine learning problems [9–11]. The key element of such devices is a programmable linear optical interferometer. When control parameters are fixed, it performs a linear transformation of $N$ input fields:
One can tune the interferometer to some target unitary transformation $V$ in two ways. The first way is empirical. For each $V$, one optimizes the control parameters in order to bring the experimental results of optical fields measurements closer to the target ones [17]. This does not require building an exact model of interferometer internal structure, but is impractical, since it is necessary to measure all matrix elements at each step. For each new target matrix $V$, the procedure must be repeated. Another way is to first build a complete mathematical model of the interferometer based on the results of some set of measurements [18]. Then, the search for the desired set of control parameters is performed without any additional measurements. This procedure faces some difficulties. Since large-scale linear optical devices can be implemented in the integral design only, it is almost impossible to split the large-scale reconstruction problem into several low-scale ones. It needs additional optical outputs or modular optical circuits [19] that causes additional losses and instability. Trying to characterize an integrated device as a whole, one has to face the fact that the results of each measurement are determined by a large number of internal device parameters. As a result, the efficiency of the optimization procedure rapidly decreases with increasing dimension $N$.
In this paper, we present a method for constructing a model of a programmable integrated interferometer based on algebraic operations only. This approach needs limited number of operations and is much more computationally efficient than the one using numerical optimization. The method is based on the layered interferometer architecture [15], which is described in Section 2. Further in Section 3, we describe the procedure for reconstructing interferometer mixing layers. As in [18], it requires the measurement of all matrix elements of the complex transfer matrix $V$ (up to a global phase) for a given configuration of phase layers. Common methods for measuring the transfer matrices of optical interferometers using laser [20], thermal [21], and two-photon [22,23] fields make it possible to measure matrix elements up to input and output phases only. A non-ambiguous measurement of the matrix $V$ can be performed using homodyne detection of all the input and output coherent fields in Eq. (1) and requires $O(N^2)$ measurements [24]. In Section 4, we show that the proposed algebraic approach allows one to predict the transformation matrix $V$ for an arbitrary configuration of phase layers with an accuracy not lower than that for the method based on numerical optimization. The mixing layers are determined ambiguously, however, in the case of homogeneous losses, our method makes it possible to uniquely determine the normalized transmission coefficients of each individual layer. In addition, we show that the method is highly robust to phase layers control errors.
2. Interferometer model
Below we consider the universal layered interferometer architecture consisting of alternating mixing and phase layers [15,18] (Fig. 1). The corresponding transfer matrix has the form
Here $U_k$ and $\Phi _k$ are transfer matrices of the mixing and phase layers, respectively. Each mixing layer is an $N$-channel transformation that sufficiently distributes the signal of each input channel to all the output channels. All matrices $U_k$ can be different. Ideally, they are unitary, but this property is lost in the presence of linear losses. For the phase layer, each channel is subjected to a controlled phase shift through a thermo- or electro-optical effect. The corresponding transfer matrix has a diagonal form:
Expression (2) can also be used to describe any other interferometer architecture (for example, the Reck [12] or Clements [13] architecture). In this case, $U_k$ can be the transfer matrices of beam splitters acting on two channels only, and the phase layers do not control all channels, but only a few ones.
With the number of mixing layers $K=N$, the layered architecture makes it possible to obtain an arbitrary transformation $V$ due to the proper choice of phases $\varphi _n^{(k)}$. We mostly consider this particular case.
The problem of constructing an interferometer model is reduced to determining all transfer matrices $U_k$ and $\Phi _k$. The dependence of $\varphi _n^{(k)}$ on the control parameters can be pre-calibrated even for unknown $U_k$. For this, a simple $\sin \varphi$ dependence of the interferometer output intensity on the value $\varphi$ of any single controlled phase is used (all other phases are disabled) [2]. Below we assume that all phase layers are pre-calibrated (later we will show that our reconstruction method is robust to phase control errors). Reconstruction of the matrices $U_k$ is complicated by the fact that there is no way to “turn off” all mixing layers, except for one, and perform its measurement. However, in the next section, we show that it is possible to obtain a matrix equation that depends on a single mixing layer only by an appropriate choice of phase layers configurations. This equation is reduced to an eigendecomposition problem, which is solved algebraically.
3. Mixing layers reconstruction
In this section, we assume that we have full control over the phase layers. Later in Section 4, we show that the proposed method is robust to control errors. We also assume that none of the mixing layers has absolute losses in any channel, so all the matrices $U_1, \dots U_K$ are invertible. We also do not restrict them to be unitary.
We describe the reconstructing algorithm in several steps. First, we consider the ideal case when, for any configuration of the interferometer, we know exactly its full transfer matrix $V$. Next, we consequentially take into account the impossibility of experimental determination of the global phase of $V$, the presence of errors in measurements and phase layers control, and the special case of homogeneous losses inside the interferometer. Finally, we present the pseudo-code for the complete algorithm. Its software implementation is available online [25].
3.1 Ideal case
Consider two configurations of the interferometer phase layers. The first configuration has all the phase layers disabled. The corresponding transfer matrix is $V_0=U_K\dots U_2U_1$. Next, the phases $\{\varphi _n, n=1,\dots,N\}$ are set in the second phase layer: $V_1=U_K\dots U_2\Phi U_1$, where $\Phi =\operatorname {diag}[\exp (i\varphi _n)]$. Consider the matrix $A_1=V_0^{-1}V_1=U_1^{-1}\Phi U_1$. Since the matrix $\Phi$ is known and diagonal, this equation is solved with respect to $U_1$ by the diagonalization of $A_1$. By induction, one can iteratively obtain all mixing matrices up to $U_{K-1}$. In particular, we enable only the $(k+1)$-th phase layer to obtain the $k$-th mixing layer:
Using $V_k$ and the results for all previous $k-1$ matrices we compute the following matrix:
The following equation determines its eigendecomposition:
Since eigenvectors diagonalize $A_k$, we can use it to construct $U_k^{-1}$. To do this, one must sort the solutions of Eq. (6) in such a way that matches eigenvalues with the diagonal of $\Phi$. Let us define the sorting indices $S=\{s_n: \lambda _{s_n}=\exp (i\varphi _n)\}$. Then the inverse transfer matrix has the form
This procedure goes successively for $k=1,\dots,K-1$. Finally, the transfer matrix of the last mixing layer is $U_K=V_0U_1^{-1}U_2^{-1}\dots U_{K-1}^{-1}$. The procedure does not involve the first and last phase layers.
Note that for the correct work of the described method it is necessary that all $\lambda _q$ from Eq. (6) (so the phases $\varphi _n$) must be different. Eigenvectors with equal eigenvalues cannot be uniquely defined, this leads to errors in constructing the columns of $U_k^{-1}$.
3.2 Ambiguity
The described algorithm determines all mixing layers of the interferometer up to following transformations:
3.3 Effect of the global phase
The approach described above gives an exact solution, when the matrices $V_k$ are exactly known. In practice, the elements of the transfer matrices are measured experimentally and, therefore, differ from the original ones. These differences are primarily related to the undetectable global phase. In this regard, in Eq. (6), we obtain eigenvalues $\lambda _q=\exp (i\delta _q)$, where the set of phases $\{\delta _q\}$ differs from the original set $\{\varphi _n\}$ by some constant value $\chi$. It makes the matching between $\{\delta _q\}$ and $\{\varphi _n\}$ ambiguous and could violate the sorting of eigenvectors. For example, a set of phases
from Fig. 3(a) is invariant under $2\pi /N$ rotation and can’t be used to determine $\chi$ in a unique way. This problem is solved by an appropriate choice of controllable phases such that $\{\varphi _n+\chi \ (\mathrm {mod}\ 2\pi )\} \neq \{\varphi _n\}$ for any non-trivial $\chi$. Consider the example presented in Fig. 3(b):One can observe that this set is invariant under the full $2\pi$ rotations only. Further, it is convenient to move from absolute phases to relative phases between adjacent channels of the phase layer (channel $N$ is considered adjacent to channel $1$):
It is easy to see that the relative phase for the first channel is twice the others for the choice (Eq. (11)):
We can then calculate the sorting indices of eigenvalues from Eq. (6) by relative phases that are insensitive to the global phase $\chi$:Systematic and statistical errors also distort the spectrum of the matrix $A_k$, leading to more general eigenvalues: $\lambda _q=r_q\exp (i\delta _q)$, where $r_q\approx 1$, $\delta _q\approx \varphi _q+\chi$. If the errors are not too large then we can again use the above approach, but instead of an exact equality we look for an approximate one:
Note that in practice this minimization is not necessary. The set Eq. (11) is sorted in ascending order, so one could also sort all $\delta _q$ and then find the maximum value of $\Delta \delta _q$. In view of Eq. (13), this value corresponds to the index $s_1$. The remaining elements will follow it on the unit circle in the anti-clockwise direction (Fig. 3(c)).
3.4 Noise handling
The described algorithm works well on small dimensions. However, as $N$ increases, the density of $\{\varphi _n\}$ values on the unit circle increases. As a result, the eigenvalue sorting rule becomes more noise sensitive (Fig. 3(d)). One can reduce this effect by activating only a few phases at a time and calculating only a few columns of the matrix $U_k^{-1}$. Denote by $C=\{c_m\}$ the set of indices of these columns and select the following set of phases (Fig. 4(a)):
The smaller the number $M$ of simultaneously estimated columns of $U_k^{-1}$, the less densely the corresponding phase values can be located. As a result it is possible to resolve the individual eigenvalues of the matrix $A_k$ from Eq. (6) quite well even for large values of dimension $N$. In the limit $M=1$, we reconstruct the columns of the matrix $U_k^{-1}$ one by one. This, however, complicates the experiment: it is required to measure $\propto N/M$ transfer matrices for each mixing layer. The total number of transfer matrices used is $L=(K-1)\lceil N/M \rceil + 1$.
Note that in the above algorithm, the phases $\delta _q$ of $A_k$ eigenvalues were used for the correct eigenvectors sorting only. At the same time, the absolute values of $\delta _q$ contain information about the true values of the controlled phases $\varphi _n$ at the phase layer. These values can serve for more accurate calibration of the phase layer, taking its cross-talk into account.
3.5 Case of homogeneous losses
The described procedure does not impose any constraints on resulting matrices $U_k$. In certain cases, it may be known a priory that the interferometer losses are homogeneous. Then each $U_k$ belongs to the set of unitary matrices up to a constant factor $\eta _0$ describing the homogeneous losses. Thus, one can enhance the accuracy by projecting each $U_k$ to this set. For some non-unitary matrix $X$, the projection is given by
Here matrices $Q$, $S$ and $W$ form the singular value decomposition of $X$. The matrices $Q$ and $W$ are unitary, while $S$ is non-negative diagonal matrix and equals to the identity matrix for unitary $X$. Within the unitary model, the transformations $\Lambda _k$ from Eq. (8) are limited by arbitrary phase shifts only (there are no linear loss blocks in Fig. 2). This ambiguity does not affect the normalized transmission coefficients of mixing layers
Thus, this coefficients are defined unambiguously. Below we consider the case $\eta _0=1$ since homogeneous losses could be measured separately and compensated by collecting more data.
3.6 Algorithm pseudo-code
Algorithm 1 summarizes the described procedure of mixing layers reconstructions. It takes the following parameters as the input: $N$ – the number of input and output channels of the interferometer; $K$ – the number of mixing layers; $M$ – the maximal number of $U_k^{-1}$ columns, estimated at once. The output of the algorithm are transfer matrices $U_k$ ($k=1,\dots,K$) of all mixing layers.
The algorithm software implementation is available online [25].
4. Numerical simulation
In this section, we analyze the efficiency of the proposed algorithm in the presence of various kinds of errors. In addition, we compare it with the optimization algorithm from [18].
4.1 Error models
We define random mixing layers in two steps. In the first step, we generate a random unitary matrix
Further we assume $\gamma = 0.1$. In the second step, if necessary, we introduce heterogeneous linear losses:
In other words, we insert random linear losses of no more than $\beta$ into the “middle” part of each random unitary transformation. Random matrices of the form (21) are generated independently for each mixing layer to create a true model of the programmable interferometer. We denote these matrices as $U_k^{\text {true}}$.
We also simulate the errors of phase layers control. Let $\{\varphi _n\}$ be the desired set of phases in the $k$-th phase layer. The relative control error of each phase is given by $\alpha$:
We simulate the statistical noise of the transfer matrix measurements by a random Gaussian error for each matrix element:
4.2 Benchmarks
After the numerical experimental data generation we reconstruct the interferometer model. Then we estimate the ability of the model to predict the full transfer matrix given arbitrary phase layer configuration (each phase of each phase layer is chosen randomly from $0$ to $2\pi$) by calculating the fidelity between the predicted $V_{\text {pred}}$ and true $V_{\text {true}}$ transfer matrices of the entire interferometer (the normalized squared dot product of the matrices):
The quantity $\Delta F = 1 - F$, in turn, plays the role of the distance between the matrices, which is insensitive to the value of the undetectable global phase (similar to the distance between two quantum processes [26]). We consider the $95$-th percentile $\Delta F_{95}$ in a test set of $1000$ random phase layer configurations [27]. Note that the value of $\Delta F_{95}$ is affected not only by the quality of the mixing layers reconstruction, but also by the presence of phase layer control errors in $V_{\text {true}}$, which are assumed to be zero in $V_{\text {pred}}$.
In Section 3.5, we pointed out that the method makes it possible to uniquely determine the normalized transmission coefficients (Eq. (18)) of mixing layers. Therefore, we also consider the measure
It characterizes the maximal error of transmission coefficients estimation over the all mixing layers. Here $T_k^{\text {true}}$ and $T_k^{\text {rec}}$ are respectively the true and reconstructed transmission matrices of the $k$-th mixing layer. Both $\Delta F_{95}$ and $\Delta T_{\max }$ are insensitive to homogeneous losses. Considering $\Delta T_{\max }$ in addition to $\Delta F_{95}$ is useful for two reasons. First, $\Delta F_{95}$ is an integral estimate that includes both errors in determining mixing layers and errors in controlling phase layers. Since the latter remains unknown, $\Delta F_{95}$ may be quite high even if each mixing layer is determined accurately. Secondly, the determination of the transmission coefficient is of independent interest for debugging the manufacturing technology of integrated interferometers. The use of $\Delta T_{\max }$, however, makes sense for the unitary model only (see Section 3.5). In the presence of heterogeneous losses, the matrices $T_k^{\text {rec}}$ are determined ambiguously and it makes no sense to compare them directly with $T_k^{\text {true}}$.
4.3 Comparison with optimization method
Let us consider the case when the number of interferometer channels $N$ and the number of mixing layers $K$ are 5 ($N = K = 5$), and there are no linear losses ($\beta =0$) and phase control errors ($\alpha =w=0$). We generate a random programmable interferometer according to Section 4.1 and perform the reconstruction of all its mixing layers according to Section 3. As in [18], we use the model of unitary mixing layers. The initial data used in the algorithm is also fed to the input of the algorithm from [18] with an open source code. We use the version of this algorithm that uses the approximate prior knowledge of mixing layers transfer matrices. Following the example from the library files, the following algorithm parameters are chosen: mini-batch size is five, the prior knowledge coefficient $\beta =0.24$. The number of training epochs varies from $20$ to $1500$ depending on the dimension $N$ and the measurement protocol (the value is high enough to reliably rich the accuracy saturation).
In total, we considered 3 different measurement protocols:
- 1. First, all phases are set to zero. Next, $20$ configurations are considered. In each one a single non-zero phase is set equal to $2\pi /3$. Each of these $20$ configurations allows one to define $M = 1$ column of the inverse transfer matrix of one mixing layer (see Section 3.4). The total number of configurations is $L = 21$.
- 2. First, all phases are set to zero. Next, $4$ configurations are considered. In each one the phases (Eq. (11)) are set on one of the phase layers. Each of these $4$ configurations allows one to determine all $M = 5$ columns of the inverse transfer matrix of one mixing layer (see Section 3.4). The total number of configurations is $L = 5$.
- 3. A set of $L = 300$ random interferometer configurations is considered (only for the optimization algorithm from [18]).
Figure 5 shows the simulation results. In the absence of phase layers control errors, the dependencies of $\Delta T_{\max }$ (Fig. 5(a)) and $\Delta F_{95}$ (Fig. 5(b)) on the measurement error $\varepsilon$ turned out to be close, regardless of the measurement protocol and the reconstruction method used. The closeness is so high that plotting them in the same axes on a log scale make the distinction barely seen. Approximately, we have observed $\Delta T_{\max }\propto \varepsilon$ and $\Delta F_{95}\propto \varepsilon ^2$. From Fig. 5(c) it can be seen that the presence of phase layer control errors $\alpha$ does not affect our method’s ability to estimate the mixing layer transmission coefficients $[T_k]_{mn}$: it is limited by the interferometer measurements accuracy only. This is because the algorithm uses the phases values only to do the correct sorting. However, starting from a certain critical value of the error $\alpha _c$, the accuracy of determining $[T_k]_{mn}$ sharply decreases, since the sorting is violated. The value of $\alpha _c$ can be increased by decreasing the number of unique phases at each iteration (it is defined by the parameter $M$ of the algorithm). The optimization algorithm from [18] relies on the particular phase values, so the control errors impair the estimation of $[T_k]_{mn}$. Despite this, the algorithm manages to train the entire interferometer as a whole with an error $\Delta F_{95}$ not higher than that provided by our algorithm (Fig. 5(d)). Approximately, we have observed $\Delta F_{95}\propto \alpha ^2$.
Despite the closeness of the results for optimization algorithm and our algorithm, the latter performs much faster. To show this, consider two extreme versions of our algorithm ($M = N$ and $M = 1$). For the optimization algorithm from [18], we use a set of 300 training phase layer configurations. Starting from a certain value, changing this number does not affect the computation speed significantly. The number of epochs is chosen to be minimal in order to guarantee a correct result. On the range of considered dimensions $N$, the presented reconstruction algorithm worked more than two orders of magnitude faster (Fig. 6). We also considered the optimization algorithm in the “black box” model (the absence of any prior knowledge about the mixing layers). In this case, as expected, there was a rapid growth in computational complexity with an increase in $N$ from two to six (as in [18], we also failed to obtain the convergence of the optimization algorithm in the “black box” model for $N > 6$). Our algorithm, also based on the “black box” model, showed a much slower increase in computation time. The independent trials of numerical experiments were performed in parallel on 52 CPU cluster; Intel Xeon CPU E5450 @ 3.00GHz.
4.4 Algorithm scalability
Above we have mentioned that the total number of required interferometer configurations for our method is $L=(K-1)\lceil N/M\rceil +1$, where $N$ is the number of input and output channels, $K$ is the number of mixing layers, and $M\in [1;N]$ is the number of simultaneously estimated columns of the inverse matrix of a single mixing layer. The time required to measure the transfer matrix for each configuration is estimated as $O(N^2)$ [24]. Therefore, for $K=N$ the total measurement time scales as $O(N^4/M)$.
The reconstruction algorithm requires $O(L)$ algebraic operations including matrix multiplication, taking the inverse matrix, and computing matrix eigendecomposition. The complexity of these procedures does not exceed $O(N^3)$ [28]. For $K=N$ the total complexity of the entire procedure can be estimated as $O(N^5/M)$.
From these estimates, one can see the number $M$ to be an important parameter that affects the algorithm speed. The value $M = N$ is optimal from the point of view of measurement and computation speed. However, in this case, with increasing $N$, the sensitivity to control errors of phase layers increases: if the error $\alpha$ is higher than the critical value $\alpha _c$, the mixing layers estimates accuracy sharply decreases. At the same time, for a fixed $M$, the value of $\alpha _c$ changes slightly with increasing $N$ (Fig. 7(a)). Note also that for $\alpha <\alpha _c$ transmission coefficients matrices $T_k$ are determined with approximately the same accuracy for all mixing layers (Fig. 7(b)). That is, despite the iterative nature of the algorithm (see Eq. (5)), there is no effect of error accumulation.
The prediction error $\Delta F_{95}$ grows polynomially with increasing $N$ (Fig. 7(c)). This is primarily because at a fixed data acquisition time, the size $L$ of the measurement protocol increases. As a result, the measurement accuracy of each transfer matrix decreases (see Eq. (24)), impairing the mixing layers estimation accuracy. The figure shows that one can slightly improve the accuracy by performing the projection of the mixing layers to the set of unitary matrices (Section 3.5). This, however, does not affect the rate of growth (about $\Delta F_{95} \propto N^4$). When the heterogeneous linear losses are present, the rate of growth becomes higher (about $\Delta F_{95} \propto N^{5.8}$ for $\beta =0.1$). This can be explained by the fact that the condition number of the transfer matrices of the mixing layers increases when heterogeneous losses occur. As a result, the matrix inversion procedure becomes more sensitive to statistical fluctuations.
5. Conclusion
We have shown that the estimation of mixing layers in the layered architecture of the programmable integrated optical interferometer is possible without any use of numerical optimization. The proposed algorithm has a single parameter $M$ "- the number of simultaneously estimated columns of a mixing layer inverse transfer matrix. We have performed the numerical simulation and compared the results with the results obtained using the optimization algorithm. It has been found that under the equal conditions both methods provide the same accuracy, regardless of the measurement protocol. At the same time, the exclusively algebraic approach in our algorithm has shown a significantly lower computation time.
Moreover, our algorithm uses the control phases values only to correctly sort the columns of the adjacent mixing layer. This makes the method almost insensitive to phase control errors up to some critical error value. This critical value can be increased by choosing a smaller value of the parameter $M$. Small $M$, however, increases the number of operations in the reconstruction algorithm, so in practice it is worth choosing an optimal $M$ based on the experimental accuracy of phases control. We believe that extending the algorithm to consider the absolute values of the phases can help in determining the phase control errors. This is the subject of further research.
Finally, note that any interferometer architecture (for example, the Reck [12] or Clements [13] architecture) can be brought to the layered architecture considered here by adding controllable phase shifters in some channels. However, the possibility to directly apply our method to an arbitrary architecture remains unclear and is also the subject of further research.
Funding
Russian Science Foundation (22-12-00263).
Acknowledgments
This work was supported by Russian Science Foundation (RSF), project No: 22-12-00263. We also thank I. Dyakonov for valuable discussions and G. Struchalin for help in conducting computations.
Disclosures
The authors declare no conflicts of interest.
Data Availability
Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.
References
1. P. Minzioni, C. Lacava, T. Tanabe, et al., “Roadmap on all-optical processing,” J. Opt. 21(6), 063001 (2019). [CrossRef]
2. J. Carolan, C. Harrold, C. Sparrow, E. Martín-López, N. J. Russell, J. W. Silverstone, P. J. Shadbolt, N. Matsuda, M. Oguma, M. Itoh, G. D. Marshall, M. G. Thompson, J. C. F. Matthews, T. Hashimoto, J. L. O’Brien, and A. Laing, “Universal linear optics,” Science 349(6249), 711–716 (2015). [CrossRef]
3. N. C. Harris, J. Carolan, D. Bunandar, M. Prabhu, M. Hochberg, T. Baehr-Jones, M. L. Fanto, A. M. Smith, C. C. Tison, P. M. Alsing, and D. Englund, “Linear programmable nanophotonic processors,” Optica 5(12), 1623–1631 (2018). [CrossRef]
4. J. Wang, S. Paesani, Y. Ding, R. Santagati, P. Skrzypczyk, A. Salavrakos, J. Tura, R. Augusiak, L. Mančinska, D. Bacco, D. Bonneau, J. W. Silverstone, Q. Gong, A. Acín, K. Rottwitt, L. K. Oxenløwe, J. L. O’Brien, A. Laing, and M. G. Thompson, “Multidimensional quantum entanglement with large-scale integrated optics,” Science 360(6386), 285–291 (2018). [CrossRef]
5. J. Wang, F. Sciarrino, A. Laing, and M. G. Thompson, “Integrated photonic quantum technologies,” Nat. Photonics 14(5), 273–284 (2020). [CrossRef]
6. I. Schwartz, D. Cogan, E. R. Schmidgall, Y. Don, L. Gantz, O. Kenneth, N. H. Lindner, and D. Gershoni, “Deterministic generation of a cluster state of entangled photons,” Science 354(6311), 434–437 (2016). [CrossRef]
7. H.-S. Zhong, Y. Li, W. Li, et al., “12-photon entanglement and scalable scattershot boson sampling with optimal entangled-photon pairs from parametric down-conversion,” Phys. Rev. Lett. 121(25), 250505 (2018). [CrossRef]
8. W. Asavanant, Y. Shiozawa, S. Yokoyama, B. Charoensombutamon, H. Emura, R. N. Alexander, S. Takeda, J.-i. Yoshikawa, N. C. Menicucci, H. Yonezawa, and A. Furusawa, “Generation of time-domain-multiplexed two-dimensional cluster state,” Science 366(6463), 373–376 (2019). [CrossRef]
9. R. Hamerly, L. Bernstein, A. Sludds, M. Soljačić, and D. Englund, “Large-scale optical neural networks based on photoelectric multiplication,” Phys. Rev. X 9(2), 021032 (2019). [CrossRef]
10. G. Wetzstein, A. Ozcan, S. Gigan, S. Fan, D. Englund, M. Soljačić, C. Denz, D. A. Miller, and D. Psaltis, “Inference in artificial intelligence with deep optics and photonics,” Nature 588(7836), 39–47 (2020). [CrossRef]
11. H. Zhang, M. Gu, X. D. Jiang, J. Thompson, H. Cai, S. Paesani, R. Santagati, A. Laing, Y. Zhang, M. H. Yung, Y. Z. Shi, F. K. Muhammad, G. Q. Lo, X. S. Luo, B. Dong, D. L. Kwong, L. C. Kwek, and A. Q. Liu, “An optical neural chip for implementing complex-valued neural network,” Nat. Commun. 12, 1–11 (2021). [CrossRef]
12. M. Reck, A. Zeilinger, H. J. Bernstein, and P. Bertani, “Experimental realization of any discrete unitary operator,” Phys. Rev. Lett. 73(1), 58–61 (1994). [CrossRef]
13. W. R. Clements, P. C. Humphreys, B. J. Metcalf, W. S. Kolthammer, and I. A. Walmsley, “Optimal design for universal multiport interferometers,” Optica 3(12), 1460–1465 (2016). [CrossRef]
14. S. A. Fldzhyan, M. Y. Saygin, and S. P. Kulik, “Optimal design of error-tolerant reprogrammable multiport interferometers,” Opt. Lett. 45(9), 2632–2635 (2020). [CrossRef]
15. M. Y. Saygin, I. Kondratyev, I. Dyakonov, S. Mironov, S. Straupe, and S. Kulik, “Robust architecture for programmable universal unitaries,” Phys. Rev. Lett. 124(1), 010501 (2020). [CrossRef]
16. S. Pai, B. Bartlett, O. Solgaard, and D. A. Miller, “Matrix optimization on universal unitary photonic devices,” Phys. Rev. Appl. 11(6), 064044 (2019). [CrossRef]
17. T. W. Hughes, M. Minkov, Y. Shi, and S. Fan, “Training of photonic neural networks through in situ backpropagation and gradient measurement,” Optica 5(7), 864–871 (2018). [CrossRef]
18. S. Kuzmin, I. Dyakonov, and S. Kulik, “Architecture agnostic algorithm for reconfigurable optical interferometer programming,” Opt. Express 29(23), 38429–38440 (2021). [CrossRef]
19. P. L. Mennea, W. R. Clements, D. H. Smith, J. C. Gates, B. J. Metcalf, R. H. S. Bannerman, R. Burgwal, J. J. Renema, W. S. Kolthammer, I. A. Walmsley, and P. G. R. Smith, “Modular linear optical circuits,” Optica 5(9), 1087–1090 (2018). [CrossRef]
20. S. Rahimi-Keshari, M. A. Broome, R. Fickler, A. Fedrizzi, T. C. Ralph, and A. G. White, “Direct characterization of linear-optical networks,” Opt. Express 21(11), 13450–13458 (2013). [CrossRef]
21. K. Katamadze, G. Avosopiants, A. Romanova, Y. I. Bogdanov, and S. Kulik, “Linear optical circuits characterization by means of thermal field correlation measurement,” Laser Phys. Lett. 18(7), 075201 (2021). [CrossRef]
22. A. Laing and J. L. O’Brien, “Super-stable tomography of any linear optical device,” arXiv, arXiv:1208.2868 (2012). [CrossRef]
23. A. Peruzzo, A. Laing, A. Politi, T. Rudolph, and J. L. O’brien, “Multimode quantum interference of photons in multiport integrated devices,” Nat. Commun. 2(1), 224 (2011). [CrossRef]
24. K. V. Jacob, A. E. Mirasola, S. Adhikari, and J. P. Dowling, “Direct characterization of linear and quadratically nonlinear optical systems,” Phys. Rev. A 98(5), 052327 (2018). [CrossRef]
25. B. Bantysh, “The package for simulating and training integrated linear optical devices,” https://github.com/PQCLab/ILOptics.
26. A. Gilchrist, N. K. Langford, and M. A. Nielsen, “Distance measures to compare real and ideal quantum processes,” Phys. Rev. A 71(6), 062310 (2005). [CrossRef]
27. B. I. Bantysh, A. Y. Chernyavskiy, and Y. I. Bogdanov, “Quantum tomography benchmarking,” Quantum Inf. Process. 20(10), 339 (2021). [CrossRef]
28. J. W. Demmel, Applied numerical linear algebra (Society for Industrial and Applied Mathematics, 1997).