## Abstract

The low-latency requirements of a practical loophole-free Bell test preclude time-consuming post-processing steps that are often used to improve the statistical quality of a physical random number generator (RNG). Here we demonstrate a post-processing-free RNG that produces a random bit within 2.4(2) ns of an input trigger. We use weak feedback to eliminate long-term drift, resulting in 24 hour operation with output that is statistically indistinguishable from a Bernoulli process. We quantify the impact of the feedback on the predictability of the output as less than $6.4\times {10}^{-7}$ and demonstrate the utility of the Allan variance as a tool for characterizing non-idealities in RNGs.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Loophole-free Bell tests [1] have provided conclusive resolution to some of the most fundamental questions on the nature of reality and are now being put to practical use in applications such as sources of certifiable randomness [2] and device-independent quantum key distribution [3]. A loophole-free Bell test is a statistical measurement of a correlation parameter observed by space-like separated parties making independent random measurements on correlated particles. Parameter values outside a well-defined bound indicate correlations that cannot be described by any locally realistic model [4]. The first Bell tests were performed by Freedman and Clauser in 1972 [5], Fry and Thompson in 1976 [6], and Aspect *et al.* in 1981 and 1982 [7–9]. However, due to technological constraints, additional assumptions, or loopholes, had to be made for the results to be interpreted as being incompatible with local realism.

The system presented here was used to help close both the locality and freedom-of-choice loopholes in a loophole-free Bell test [1]. The freedom-of-choice loophole requires that Alice and Bob’s measurement settings be chosen randomly and independently of each other, a task suitable for any number of random number generators whose bits are chosen by an understood and trusted mechanism. However, the locality loophole adds the requirement that no information about one party’s measurement be accessible to the other party before the measurements are complete. For local hidden-variable theories, this means that for each trial of the Bell test the choice of measurement settings and the measurements themselves must be completed before a signal traveling at the speed of light could reach the other party. This is achieved by space-like separating the two parties; the loophole-free Bell test reported in [1] used a distance of 184.9(1) meters between Alice and Bob, opening, in principle, a window of 616.3(3) nanoseconds during which the measurement settings must be chosen, applied, and the measurements completed. However, accounting for practical limitations (fiber propagation delays, device latencies, etc.) the actual duration of the mutually space-like-separated window was just 138 ns. Having a measurement window as long as possible, and a randomness generation process as short as possible, improves the performance of the Bell test and, in the context of certifiable randomness generation, improves the bit generation rate.

To achieve ultra-low-latency random bit generation, we developed a photon-sampling random number generator (PSRNG) based on single-photon detection of optical states in the high-loss regime [9]. In this approach the state from which the bit is generated, a laser pulse, is produced on-demand at the arrival of a trigger, and the propagation path lengths from the source, through the attenuators to the detector, and the path of the detector signal itself, are all kept to a minimum. As a result, we achieve a random-number-generation latency of 2.4(2) ns, defined as the time between the arrival of the triggering signal and the availability of the random bit at the output. Other recent RNGs designed specifically for loophole-free Bell tests have been reported [10,11] with latencies of 10.54(53) ns and 9.8(2) ns.

Many RNGs have significant bias in their raw output and require post-processing algorithms to achieve an unbiased output. Such post-processing necessarily adds to the time between the initiation of the random process and the delivery of an unbiased bit. Eliminating the need for any post-processing allows for the prompt delivery of random bits, but it leaves the system susceptible to changes in uncontrolled parameters that can result in a slow, long-term drift in the ratio of zeros to ones output by the RNG. Such drifts are common in the raw output of physical RNGs and are typically addressed with techniques such as hash functions [12], or sequential applications of exclusive-or (XOR) gates [10]. Instead of post-processing, and the attendant latency it adds, we have developed a feedback system that keeps the bit bias close to the ideal over arbitrary times. By employing clock stability and frequency analysis techniques, we identify the time scale on which the free-running PSRNG deviates from a Bernoulli (binary random) process with equal probabilities of 0 and 1, and we developed the feedback based on this understanding. In this manner we can eliminate drift and maintain highly balanced low-latency random bit generation over arbitrary time scales. The fact that this PSRNG is a controlled Bernoulli process rather than a truly free-running process means that with sufficient statistics the effects of the feedback control will be evident in the observed variance. However, we show that the strength of control required to eliminate drift is so low that an output sample 24 hours long (at a rate of $\approx 1\times {10}^{5}$ bits/s) is indistinguishable from an unbiased process and passes the National Institute of Standards and Technology (NIST) SP 800-22 random number test suite [13] without any post-processing. 
Such performance means that our PSRNG can serve as an ultra-fast “coin flip” for measurement-basis selection, maximizing the trial window length in loophole-free Bell tests.

## 2. Methodology

The PSRNG, shown in Fig. 1, begins the generation of each random bit on receipt of a trigger, allowing for a precise bound to be placed on the time between when a bit is requested and when it is generated. Upon the trigger’s arrival, a pulse-forming circuit consisting of a toggle flip-flop (TFF) and an AND gate with length-mismatched inputs produces a 950 ps electrical pulse that is amplified and used to drive a high-speed 850 nm vertical-cavity surface-emitting laser (VCSEL). The VCSEL generates a 300 ps optical pulse that is collimated, attenuated by $4.3\times {10}^{6}$, and focused onto the active area of a single-photon avalanche diode (SPAD) module. To reduce latency, avalanche signals are discriminated by a dedicated high-speed comparator (cmp) inserted within the circuitry of the SPAD module, and output bit values are determined from whether or not a detection event occurs in an interval whose beginning and end are defined by clocking two fast D-type flip-flops (DFFs).

The length of the detection interval is set to 750 ps by the difference of two electrical path lengths. Delay_{1} sets the beginning of the detection interval to the start of the measured photon-detection distribution, as indicated in Fig. 2. The end of the detection interval is determined by delay_{2}, at which time the readout DFF is clocked, sampling the state of the SPAD [11]. If a detection occurred, a logical-1 is recorded, otherwise a logical-0 is recorded. The setup and hold times of the flip-flops are small, and we have tuned delay_{2} such that the position of the clock edge is on the tail end of the distribution of photon arrival times registered by the SPAD, as shown in Fig. 2. This cutoff was chosen to roughly balance the influence of the clock’s arrival time on the bit bias while retaining low latency. We note that this balance could be adjusted depending on the relative importance of latency and bit bias in the ultimate application.

SPADs exhibit dark counts and have some recovery time after each detection event during which they are inactive [14]. This imposes a unique requirement on single-photon-detector-based RNGs for low-latency bit generation in a loophole-free Bell test. If a dark count has occurred within one recovery time prior to when a bit is requested, then the resulting bit value is not guaranteed to be causally isolated from the other Bell-test party. The low-latency RNG reported in [11] generates randomness in a similar manner to the one presented here. That system reduces the probability of these non-low-latency events and suppresses the influence of afterpulsing by XORing the output bits of two separate systems. In our case, we identify all instances when the SPAD was inactive at the time of a bit request by means of “not-ready” DFF_{1} and DFF_{2}. The operation of DFF_{1} is asynchronous to that of the trigger; upon being clocked by *any* detected avalanche, its output will remain high for a time determined by delay_{3}, until it is reset. Delay_{3} is tuned to slightly longer than the 55(1) ns recovery time of the detector. If a detection occurs immediately before the input trigger, the output of not-ready DFF_{1} will be high when sampled after delay_{1}, and the NR output will be logical-1, indicating that the SPAD was not armed and ready when a bit was requested. Due to the low dark count rate of the SPAD, not-ready events occur at a rate of ≈${10}^{-5}$ per trigger.

Afterpulsing, or an increase in the background noise correlated with a prior detection event, could induce correlations of logical-1s in the PSRNG if the time between triggers is short relative to the timescale over which afterpulsing persists [11]. For the SPAD used in this work, typical 1/*e* afterpulsing times are < 50 ns, and when operating at 100 kHz with a 750 ps detection interval, there is no discernible correlation of logical-1s in 24 hours of data acquisition (see below).

To monitor the PSRNG’s performance and to calculate the feedback signal, the states of the RND and NR outputs are monitored by a microcontroller. A high-speed digital fanout makes a copy of the input trigger that is routed to the microcontroller, which initiates an interrupt service routine that records the value of the RND and NR bits. If the NR signal is high, the RND bit is discarded. After each 100 000 valid bits (≈1 s), the number of observed logical-1s, ${n}_{1}$, is used to make an estimate of the PSRNG’s bit bias $p$, where $p={n}_{1}/(100000)$. The microcontroller also records various other measurements including temperature, DAC register status, and NR rate, which are transmitted over a serial port for real-time monitoring.
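The bookkeeping described above can be sketched in a few lines. This is only an illustration, not the microcontroller firmware: the function names, the simulated trigger stream, and its parameters (`p_one`, `p_not_ready`, `seed`) are assumptions made for the example.

```python
import random

BLOCK_SIZE = 100_000  # valid bits per bias estimate, as in the text


def estimate_bias(samples, block_size=BLOCK_SIZE):
    """Yield one estimate p = n1 / block_size per block of valid bits.

    `samples` is an iterable of (rnd, nr) pairs. Pairs whose not-ready
    flag nr is set are discarded and do not count toward the block.
    """
    n_valid = 0
    n_ones = 0
    for rnd, nr in samples:
        if nr:  # SPAD was not ready when the bit was requested
            continue
        n_valid += 1
        n_ones += rnd
        if n_valid == block_size:
            yield n_ones / block_size
            n_valid = 0
            n_ones = 0


def simulated_outputs(n_triggers, p_one=0.5, p_not_ready=1e-5, seed=1):
    """Hypothetical stand-in for the hardware: independent RND and NR bits."""
    rng = random.Random(seed)
    for _ in range(n_triggers):
        rnd = 1 if rng.random() < p_one else 0
        nr = 1 if rng.random() < p_not_ready else 0
        yield rnd, nr


# 250 000 triggers give two complete blocks; the partial third is dropped.
estimates = list(estimate_bias(simulated_outputs(250_000)))
```

Because not-ready triggers are skipped rather than counted, the wall-clock time per estimate varies slightly, while the number of bits per estimate stays fixed.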

Under feedback, the amplitude of the optical state, and thus the bit bias $p$, is tuned by adjusting the forward voltage across the VCSEL. The voltage is controlled by the microcontroller via a two-channel 16-bit digital-to-analog converter (DAC) and a summing amplifier. The DAC channel outputs are separately divided down to form coarse and fine-adjustment voltages *v*_{c} and *v*_{f}, and are added to a DC voltage *v*_{a} of 4.3 V. The output of the summing amplifier biases the VCSEL’s anode, and the output of the RF amplifier (located on the cathode) is also biased at 4.3 V. During the period between triggers, the forward voltage across the VCSEL is < 300 mV, low enough that after optical attenuation, spontaneous emission makes no measurable contribution to the background count rate of the SPAD. The DAC channels are divided down such that 1 least significant bit (LSB) in the coarse channel corresponds to a change in $p$ of $5.15\times {10}^{-7},$ and 1 LSB in the fine channel corresponds to a change of $5.32\times {10}^{-8}$. The PSRNG’s measured bit bias while the DAC is varied over its full range is shown in Fig. 3.
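As a rough illustration of the two-channel scaling, the sketch below converts DAC steps into a change in $p$ using the per-LSB values quoted above. It assumes the response is linear over the whole register range, which the text does not state; the function and constant names are hypothetical.

```python
COARSE_LSB = 5.15e-7  # change in p per coarse-channel LSB (from the text)
FINE_LSB = 5.32e-8    # change in p per fine-channel LSB (from the text)
DAC_BITS = 16


def delta_p(coarse_steps, fine_steps):
    """Change in bit bias for a given number of DAC steps (linear model)."""
    return coarse_steps * COARSE_LSB + fine_steps * FINE_LSB


# Span of each channel over the full 16-bit register, under the linear model.
full_range_coarse = (2**DAC_BITS - 1) * COARSE_LSB
full_range_fine = (2**DAC_BITS - 1) * FINE_LSB

# Six fine-channel steps change p by about 3.19e-7.
six_fine_steps = delta_p(0, 6)
```

Under this model the fine channel spans several thousand coarse LSBs, so the two channels overlap generously.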

With feedback disabled the output bit bias can slowly drift enough to render the system unusable for high-quality random bit generation. The causes are identified as largely thermal, and to minimize their effect the system is mounted on an aluminum plate and enclosed in an insulated acrylic box. An active temperature controller and Peltier stage are used to control the temperature of the aluminum plate, with the controller’s temperature monitor positioned at the location of the pulse-forming circuit, the region we have determined to be the most sensitive to temperature changes. In addition, low temperature-coefficient components have been used where possible. Thermal control alone reduces the free-running PSRNG’s bit bias drift to ≈$2\times {10}^{-4}$ per day under typical laboratory conditions.

## 3. Drift characterization

It is likely that the overall instability observed in the free-running PSRNG is a combination of both known (i.e., thermal and 1/*f* noise) and unknown causes. To assist in identifying the composition of the drift, we use the Allan variance [15].

Unlike the standard variance, which is referenced to a sample’s mean and diverges for most types of drift, the Allan variance is referenced to adjacent measurements, and thus is useful for analyzing the character of long-term drift. The slope of the Allan variance versus measurement interval depends on the noise type, making it a useful tool for diagnosing noise sources. The Allan variance is one half of the average squared difference between successive measurements made at a sampling interval $\tau $:

$${\sigma}_{y}^{2}\left(\tau \right)=\frac{1}{2\left(M-1\right)}\sum _{i=1}^{M-1}{\left({y}_{i+1}-{y}_{i}\right)}^{2},\qquad (1)$$

where *y*_{i} is the *i*^{th} of *M* measured values averaged over the sampling interval, *τ*. It is conventional to report the Allan deviation, ${\sigma}_{y}(\tau )$, which is simply the square-root of the Allan variance. For normally distributed white noise, the Allan deviation exhibits a ${\tau}^{-1/2}$ dependence on the sampling interval. Other types of noise behave differently: 1/*f* noise goes like ${\tau}^{0}$, and random-walk frequency noise goes like ${\tau}^{1}$. This behavior assists in identifying the composition of a system’s overall output, and the strength and timescales over which the various types of noise are dominant. We used this technique to identify the types of noise present in our system and to develop an effective feedback protocol.

The measurement value *y*_{i} in Eq. (1) is the average of $\tau $ measurements of the bit bias $p$, where $p$ is the average of 100 000 output bits, as described above. In our PSRNG, the amount of time it takes to produce 100 000 valid low-latency bits varies slightly due to triggers that arrive when the system is “not-ready.” Thus, it is worthwhile to highlight the following subtle distinction: traditionally the Allan sampling interval is measured in time, but in our case the sampling interval $\tau $ is measured in numbers of measurements of $p$, each of which takes about one second when triggered at ${10}^{5}$ Hz.
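A minimal sketch of this calculation follows, using the non-overlapping form of the estimator and a simulated ideal source in place of measured data. The function name, the seed, and the series length are assumptions for illustration; the overlapping estimators described in [15] would make better use of a finite record.

```python
import numpy as np


def allan_deviation(p, tau):
    """Non-overlapping Allan deviation of the series p.

    tau is the averaging length in samples (numbers of measurements of p),
    matching the convention used in the text.
    """
    p = np.asarray(p, dtype=float)
    m = len(p) // tau
    if m < 2:
        raise ValueError("need at least two averaged values")
    # y_i: mean of tau consecutive bias measurements
    y = p[: m * tau].reshape(m, tau).mean(axis=1)
    # one half of the mean squared difference of adjacent y_i
    return float(np.sqrt(0.5 * np.mean(np.diff(y) ** 2)))


# Simulated ideal source: each p is the mean of 100 000 fair bits.
rng = np.random.default_rng(0)
n_bits = 100_000
p_white = rng.binomial(n_bits, 0.5, size=200_000) / n_bits

taus = [1, 10, 100, 1000]
adev = [allan_deviation(p_white, t) for t in taus]
```

For this ideal series the values in `adev` fall by roughly a factor of $\sqrt{10}$ per decade in $\tau$, which is the ${\tau}^{-1/2}$ white-noise signature.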

To characterize the drift in the PSRNG we disabled the feedback and acquired 96 hours of random output. Figure 4 shows a plot of the Allan deviation for this data set. The red dashed line is the behavior expected for a white noise source, while the solid blue line is the output of the PSRNG with feedback disabled. Without feedback, the PSRNG displays white noise characteristics out to a sampling interval of ${10}^{3}$ (approximately ${10}^{3}$ s).

To ensure that there are no overlooked deviations in the region $\tau <{10}^{3}$ that might have been averaged out, we used an extension of the Allan deviation, the dynamic Allan deviation (DADEV) to analyze the $\tau <{10}^{3}$ region. The DADEV is intended to characterize systems that exhibit levels of instability that may vary in time and is calculated by splitting the full data set into smaller windows, calculating the Allan deviation of each, and comparing their behavior [16,17]. The DADEV of the 96 hour data set, separated into 44 windows of length 4 hours, and overlapping by 2 hours, is shown in Fig. 5. No short-term instabilities can be seen, indicating that without feedback the PSRNG’s output is well modeled as random white noise for sampling intervals less than ${10}^{3}$ (≈${10}^{3}$ s).

Without feedback, at longer averaging periods ($\tau >{10}^{3}$) significant deviation begins to occur between the measured ${\sigma}_{y}\left(\tau \right)$ and a white noise process. The ${\tau}^{0}$ and ${\tau}^{1}$ characters of the Allan deviation beyond ${10}^{3}$ in Fig. 4 suggest that the non-white noise includes a combination of flicker and random-walk type frequency noise [15]. We find that the observed Allan deviation from $\tau ={10}^{0}$ to $\tau ={10}^{5}$ can be reasonably well fit (*R*^{2} = 0.9829) by modeling the PSRNG’s output as the sum of an ideal white noise process and a single random walk process of the form ${X}_{t}=\rho {X}_{t-1}+\epsilon ,$ where $\rho =1$, and *ε* is a white noise process with mean *µ* = 0 and variance ${\sigma}_{\epsilon}^{2}=2.8\left(1\right)\times {10}^{-13}.$
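To see how such a two-component model departs from white noise, the sketch below evaluates the Allan deviation expected for independent white noise plus a random walk. The closed form used for the random-walk term, ${\sigma}_{\epsilon}^{2}(2{\tau}^{2}+1)/(6\tau )$, is the textbook result for the non-overlapping estimator applied to block averages of a random walk; it is an assumption here and not necessarily the fit function used for Fig. 4.

```python
import numpy as np

SIGMA_WN2 = 0.25 / 100_000  # variance of p for 1e5 fair bits (2.5e-6)
SIGMA_EPS2 = 2.8e-13        # random-walk step variance quoted in the text


def model_adev(tau, sigma_wn2=SIGMA_WN2, sigma_eps2=SIGMA_EPS2):
    """Allan deviation of independent white noise plus a random walk."""
    tau = np.asarray(tau, dtype=float)
    white = sigma_wn2 / tau
    walk = sigma_eps2 * (2.0 * tau**2 + 1.0) / (6.0 * tau)
    return np.sqrt(white + walk)


taus = np.logspace(0, 5, 6)  # 1, 10, ..., 1e5 samples
adev_model = model_adev(taus)
adev_white_only = np.sqrt(SIGMA_WN2 / taus)

# Ratio of the model to the pure white-noise line at each tau.
excess = adev_model / adev_white_only
```

With these parameters the ratio stays within a few percent of 1 up to $\tau ={10}^{3}$ and grows to more than an order of magnitude by $\tau ={10}^{5}$.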

## 4. Feedback

Based on the Allan deviation analysis above, we applied the feedback model illustrated in Fig. 6 to the PSRNG, where the output is the sum of an ideal white noise process *w(t)* and a random walk *o(t)*. The bit bias $p$ of the system is measured by the microcontroller, and with an appropriately designed filter *f(t)* we can estimate the magnitude of *o(t)* by monitoring the deviation from the expected behavior of an ideal white noise process. The mean of the white noise process is controlled by adjusting the VCSEL’s voltage bias, thereby tuning the probability of detection to eliminate *o(t),* leaving a random signal well approximated by white noise.

When determining the type of digital filter to use, we note that the majority of non-white components reside at frequencies less than ${f}_{c}\approx 2\times {10}^{-4}$ Hz, as revealed by a discrete Fourier transform (DFT) of measured bit biases and shown in Fig. 7. This suggests that *f(t)* be of the low-pass type, with a stop band at frequencies higher than ${f}_{c}$. Given the constraints imposed by both the bit generation rate and the microcontroller, we implemented *f(t)* as a simple moving-average filter.
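The frequency response of a moving average can be written in closed form, which makes it easy to check how a given window length treats ${f}_{c}$. The sketch below assumes a sample rate of exactly 1 Hz (the text gives ≈1 s per sample) and a window of 512 samples; the function name is hypothetical.

```python
import numpy as np


def moving_average_response(f, m, fs=1.0):
    """Magnitude response |H(f)| of an m-point moving average.

    f is the frequency in Hz and fs the sample rate in Hz.
    """
    x = np.asarray(f, dtype=float) / fs
    num = np.sin(np.pi * x * m)
    den = m * np.sin(np.pi * x)
    with np.errstate(divide="ignore", invalid="ignore"):
        h = np.abs(num / den)
    return np.where(den == 0.0, 1.0, h)  # the limit at f = 0 is 1


M = 512     # window length in samples
F_C = 2e-4  # Hz, upper edge of the non-white band

gain_at_fc = float(moving_average_response(F_C, M))
first_null_hz = 1.0 / M  # first zero of the response
null_response = float(moving_average_response(first_null_hz, M))
sidelobe = float(moving_average_response(1.5 / M, M))
```

The response at ${f}_{c}$ is close to unity, the first null sits near $2\times {10}^{-3}$ Hz, and the first sidelobe is only about 13 dB down, which is the ripple-like stopband behavior of this filter type.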

Every ${10}^{5}$ valid bits (≈1 s), the estimated drift of the system is evaluated by subtracting the number of recorded logical-1 bits, ${n}_{1}\left(t\right)$, from the expected mean value of $5\times {10}^{4}$ bits. These values are inserted into an *M*-element array, where *M* is the length of the averaging window. For each new value of ${n}_{1}\left(t\right)$ an error term $e\left(t\right)$ is calculated by summing the array’s elements, and multiplying by a gain parameter *g.* The error term therefore takes the form

$$e\left(t\right)=g\sum _{i=0}^{M-1}\left[5\times {10}^{4}-{n}_{1}\left(t-i\right)\right].\qquad (2)$$

For simplicity we have chosen *g* to encapsulate both the 1/*M* averaging factor and the scaling for the bit resolution of the DAC. The *M*-element sum can be calculated recursively by adding the most recent measure of ${n}_{1}\left(t\right)$, while subtracting the oldest $({n}_{1}\left(t-M\right))$. This fast operation allows the microcontroller to apply the filter efficiently and without interrupting other processes. The choice of *M* and *g* depend on the strength and time scales of the observed drift, both of which are estimated from the Allan deviation of the free-running system.
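The recursive update can be sketched as follows. This is an illustrative Python version, not the firmware: the class name is hypothetical, the window is assumed to start filled with zeros (so the first *M* outputs average over fewer real measurements), and the default gain is simply a placeholder of the form used later in the text.

```python
from collections import deque


class MovingSumFeedback:
    """Error term e(t) = g * sum of the last M drift estimates."""

    def __init__(self, m=512, gain=0.32 / 512**2, target=50_000):
        self.gain = gain
        self.target = target
        self.window = deque([0] * m, maxlen=m)  # oldest entry at index 0
        self.total = 0                          # running M-element sum

    def update(self, n1):
        """Insert a new count of logical-1s and return the error term."""
        drift = self.target - n1
        # Recursive sum: add the newest value, subtract the oldest.
        self.total += drift - self.window[0]
        self.window.append(drift)  # maxlen makes the deque drop the oldest
        return self.gain * self.total


# Small worked case with M = 3 and unit gain.
fb = MovingSumFeedback(m=3, gain=1.0)
outputs = [fb.update(n1) for n1 in (49_999, 49_998, 50_003, 50_000)]
```

Each update costs one addition and one subtraction regardless of *M*. On a microcontroller, a power-of-two window length would also let a circular buffer index wrap with a bit mask.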

To estimate an appropriate length for the averaging window, we compare how the variances of the two processes scale with averaging length. From the Allan deviation we estimated the additional noise to be a random walk in $p$ with variance ${\sigma}_{\text{rw}}^{2}=2.8\times {10}^{-13}$. Assuming a perfectly balanced Bernoulli process, the variance of ${n}_{1}\left(t\right)$ would be $2.5\times {10}^{4}$ counts, which gives a variance in $p={n}_{1}(t)/{10}^{5}$ of ${\sigma}_{\text{wn}}^{2}=2.5\times {10}^{-6}$, since $Var\left(aX\right)={a}^{2}Var\left(X\right)$. When averaged over $M$ measurements the variance of a white noise process scales as $1/M$, while the variance of a random walk process scales as $M$. Given the values above, we find the observed variances of the two processes become equal at an averaging length of 2989 measurements of ${n}_{1}(t)$, or ≈50 minutes.
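The crossover follows from setting the two scalings equal, ${\sigma}_{\text{wn}}^{2}/M=M{\sigma}_{\text{rw}}^{2}$, so that $M=\sqrt{{\sigma}_{\text{wn}}^{2}/{\sigma}_{\text{rw}}^{2}}$. A short check of the arithmetic, with variable names chosen for the example:

```python
import math

n_bits = 100_000
var_n1 = n_bits * 0.5 * 0.5       # binomial variance of n1 for p = 0.5
sigma_wn2 = var_n1 / n_bits**2    # Var(aX) = a^2 Var(X) with a = 1/n_bits
sigma_rw2 = 2.8e-13               # random-walk variance from the Allan fit

# White-noise variance falls as 1/M, random-walk variance grows as M.
m_equal = math.sqrt(sigma_wn2 / sigma_rw2)
minutes = m_equal / 60            # at about one measurement per second
```

This gives a crossover just under 3000 measurements, or roughly 50 minutes, in line with the value quoted above.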

Although averaging up to 2989 samples increases confidence in $p$, during this time the system is also drifting. The expected travel of a random walk process with variance ${\sigma}^{2}$ approaches $\sqrt{2M{\sigma}^{2}/\pi}$ for large *M*, which for *M* = 512 and the random walk estimated from the Allan variance, corresponds to a change in $p$ of $9.55\times {10}^{-6}$, or 0.955 bits out of 100 000. We have chosen to attempt to keep the output balanced to within 1/(50 000) of the ideal 0.5 (a mean of 50 000(1) logical-1s in 100 000 bits). Numerical simulations show that performance is relatively insensitive to small changes in *M*, with window sizes ranging from 200 to 2000 samples having residuals within 10% of each other. Therefore, to keep the amount of uncorrected drift low and allow for faster-indexing arithmetic in the microcontroller we chose *M* to be 512.
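The expected travel over one window can be evaluated directly and cross-checked with a small Monte Carlo run. The sketch assumes Gaussian steps, for which $\sqrt{2M{\sigma}^{2}/\pi}$ is the mean absolute displacement after *M* steps; the seed and the number of simulated walks are arbitrary choices.

```python
import math

import numpy as np

M = 512              # averaging window length
SIGMA_RW2 = 2.8e-13  # random-walk step variance in p
N_BITS = 100_000     # bits per measurement of p

# Expected absolute displacement of the walk after M steps.
expected_travel = math.sqrt(2 * M * SIGMA_RW2 / math.pi)
travel_in_counts = expected_travel * N_BITS  # expressed in bits per block

# Monte Carlo check with 5000 independent walks of M Gaussian steps.
rng = np.random.default_rng(2)
steps = rng.normal(0.0, math.sqrt(SIGMA_RW2), size=(5_000, M))
simulated_travel = float(np.abs(steps.sum(axis=1)).mean())
```

Both estimates come out a little under ${10}^{-5}$ in $p$, that is, just under one bit per 100 000.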

To determine the feedback gain $g$, we first note that this type of feedback has a critical gain value ${g}_{\text{c}}$, above which the system becomes unstable and oscillates. Thus ${g}_{\text{c}}$ serves as an upper bound for the feedback gain. To determine ${g}_{\text{c}}$ for a given $M$ we examine the recurrence relation of the averaging filter acting on the deviation $x(t)={n}_{1}(t)-5\times {10}^{4}$ of the mean output from its setpoint,

$$x\left(t+1\right)=x\left(t\right)-g\sum _{i=0}^{M-1}x\left(t-i\right),\qquad (3)$$

which has a characteristic form

$${\lambda}^{M}-{\lambda}^{M-1}+g\sum _{i=0}^{M-1}{\lambda}^{i}=0.\qquad (4)$$

Recurrence relations are stable if and only if their eigenvalues, or roots of their characteristic equation, all have magnitude less than 1 [18]. Therefore, calculating ${g}_{\text{c}}$ is equivalent to factoring this ${M}^{th}$-order polynomial and determining the maximum value of $g$ for which all roots are within the unit circle. We have solved for these roots numerically and found that the critical gain for an $M$-length averaging window takes the form ${g}_{\text{c}}=1-\text{cos}(\frac{\pi}{M})$, which for large *M* can be approximated by ${g}_{\text{c}}\approx 5/{M}^{2}$, and places an upper bound on the gain of $g<{g}_{\text{c}}(M=512)\approx 1.88\times {10}^{-5}$.
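A numerical root search of this kind can be sketched as below. The closed-loop recurrence is an assumption made for the example: the deviation from the setpoint is taken to obey $x(t+1)=x(t)-g\sum_{i=0}^{M-1}x(t-i)$, whose characteristic polynomial is ${\lambda}^{M}-{\lambda}^{M-1}+g({\lambda}^{M-1}+\dots +1)$. The function names, the starting gain, and the tolerance are hypothetical.

```python
import math

import numpy as np


def max_root_magnitude(g, m):
    """Largest |root| of lambda^m - lambda^(m-1) + g*(lambda^(m-1)+...+1)."""
    coeffs = np.full(m + 1, g, dtype=float)  # highest power first
    coeffs[0] = 1.0                          # lambda^m
    coeffs[1] = g - 1.0                      # lambda^(m-1)
    return float(np.max(np.abs(np.roots(coeffs))))


def critical_gain(m, g_start=1e-4, rel_tol=1e-6):
    """Smallest gain at which a root reaches the unit circle.

    Doubles the gain from g_start (assumed stable) until the loop is
    unstable, then bisects between the last stable and first unstable gain.
    """
    g_hi = g_start
    while max_root_magnitude(g_hi, m) < 1.0:
        g_hi *= 2.0
    g_lo = g_hi / 2.0
    while (g_hi - g_lo) > rel_tol * g_hi:
        g_mid = 0.5 * (g_lo + g_hi)
        if max_root_magnitude(g_mid, m) < 1.0:
            g_lo = g_mid
        else:
            g_hi = g_mid
    return 0.5 * (g_lo + g_hi)


# Compare the numerical result with 1 - cos(pi/M) for a few small windows.
comparison = {m: (critical_gain(m), 1.0 - math.cos(math.pi / m))
              for m in (2, 4, 8, 16, 32)}

# Closed-form value for the window length used in the text.
g_c_512 = 1.0 - math.cos(math.pi / 512)
```

For the assumed recurrence the numerical and closed-form values agree for each window length tried, and the closed form gives about $1.88\times {10}^{-5}$ for *M* = 512. The large-*M* limit is ${\pi}^{2}/(2{M}^{2})\approx 4.93/{M}^{2}$.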

To further optimize the gain, we again turn to the Allan deviation. Applying too little feedback will underestimate the random-walk component and allow the system to drift, resulting in an increase in the Allan deviation at longer averaging periods. Similarly, too strong a feedback signal will cause an overcorrection towards the desired setpoint, reducing the probability of the more extreme values located in the tails of the ideal probability distribution, resulting in a lower-than-expected Allan deviation at longer averaging periods. Combined with the averaging filter’s ripple-like stopband performance, overestimating the drift also results in the addition of higher frequency components, and a larger-than-expected variance at averaging periods close to integer multiples of *M*.

Using ${g}_{\text{c}}$ as a starting point, we numerically simulate applying averaging-filter feedback to the uncontrolled data set used in Fig. 4 and calculate the Allan deviation of the result. We find that the residual difference between the resulting Allan deviation and that of an ideal white noise process exhibits a unique minimum at a gain value of ${g}_{\text{min}}=0.32/\left({512}^{2}\right)$, as shown in Fig. 8. We use this value for the feedback gain.

## 5. Entropy loss

The feedback is based on past events, and therefore its contribution to the current state of the RNG must be considered accessible to all prior to the next output. To estimate the amount of information about the PSRNG’s next output bit that can be gained from knowing its prior history, we assume that $p$ ≈0.5, and each measurement of ${n}_{1}(t)$ will belong to a binomial distribution with standard deviation $\sigma =\sqrt{np(1-p)}$ = 158.1 bits. The standard deviation of the sum inside the feedback window of length *M* = 512 is $\sqrt{M}\sigma =$3577.7 bits, and when multiplied by the optimum gain factor ${g}_{\text{min}}$ results in an applied feedback value of 0.0043 bits, or an adjustment to $p<4.3\times {10}^{-8}$ for a prior output history over the averaging window that is within one standard deviation of the mean. This shows that the per-second adjustments made by the feedback system are typically very small.

If we consider the extreme case of the feedback being calculated from an output sequence that is 5σ from the mean, then $\Delta p$ = $2.15\times {10}^{-7}$. With knowledge of this anomalous event an attacker trying to predict the subsequent output would gain an advantage equal to the excess predictability of the next bit. An ideal random bit has predictability $0.5$, but a partially random bit has an excess predictability equal to $2\times \mathrm{max}\left[p\left(0\right),p\left(1\right)\right]-1$. For the extreme 5*σ*-value of $\Delta p$, this results in an excess predictability of $4.3\times {10}^{-7}$, a very small advantage [10]. We note that for our system, a 5*σ* event will occur, on average, once every 38.6 days.
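The chain of estimates in this section is short enough to reproduce directly. The sketch below uses only values quoted in the text; the variable and function names are chosen for the example, and the results match the quoted figures to within a few percent because the text rounds the intermediate values.

```python
import math

N_BITS = 100_000       # bits per measurement of n1
M = 512                # feedback window length
G_MIN = 0.32 / M**2    # optimum gain

sigma_block = math.sqrt(N_BITS * 0.5 * 0.5)  # std. dev. of n1, about 158.1
sigma_sum = math.sqrt(M) * sigma_block       # std. dev. of the window sum
feedback_1sigma = G_MIN * sigma_sum          # applied correction, in bits
dp_1sigma = feedback_1sigma / N_BITS         # same correction, in units of p
dp_5sigma = 5 * dp_1sigma                    # extreme 5-sigma history


def excess_predictability(dp):
    """2 * max[p(0), p(1)] - 1 for a bit with p(1) = 0.5 + dp."""
    p1 = 0.5 + dp
    return 2 * max(p1, 1.0 - p1) - 1.0


advantage_5sigma = excess_predictability(dp_5sigma)
```

Since the excess predictability is simply $2\left|\Delta p\right|$, the attacker's advantage scales linearly with the feedback gain, which is why a weak feedback signal is desirable.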

Empirically, we confirmed the behavior of the feedback by recording the DAC register values for a 96 hour data set when the feedback was operational, a histogram of which is shown in Fig. 9. The largest observed feedback signal was a per-sample correction of ± 6 DAC bits, corresponding to $\Delta p=3.19\times {10}^{-7},$ and an excess predictability of $6.4\times {10}^{-7}$. The slight increase over the theoretical estimate is due to the bit resolution of the DAC being larger than the estimated average $\Delta p$.

## 6. Results

With feedback enabled, another data set of over 96 hours was taken. The measured ${\sigma}_{y}\left(\tau \right)$, along with the free-running and expected results from simulation, are shown in Fig. 10. In comparison to the free-running data set, the difference between the feedback-controlled system and an ideal white noise process has been greatly reduced; ${\sigma}_{y}\left(\tau \right)$ closely matches white noise behavior out to averaging periods of ${10}^{5}$, or approximately 27.6 hours. As can be seen in the residual difference between the two, the poor stop-band performance of the moving-average filter slightly raises the variance at shorter intervals $({10}^{2}<\tau <{10}^{3})$; this effect could be reduced somewhat by designing a higher-order filter. Figure 11 shows the observed bit biases with and without feedback.

The feedback-controlled PSRNG’s output was evaluated with the NIST statistical test suite for random numbers [13]. When split into 24 hour data sets the feedback-controlled output consistently passed all tests. When analyzed as a single 96 hour data set, the output narrowly fails. This is not unexpected, because with enough statistics the presence of feedback can be detected, as the feedback suppresses certain long-term behaviors that an ideal RNG would exhibit.

To visualize the timescales over which the PSRNG’s output is indistinguishable from a pure random source, we compare the results of a particular test in the NIST test suite, the frequency test, for different time periods. At each iteration, 100 $N$-second blocks are extracted from each of the 96 hour data sets, with individual blocks being uniformly spaced to avoid testing only a small interval of data. For this test, a minimum of 96 of the 100 tests must pass for the sequence to be considered random. As shown in Fig. 12, the behavior of the PSRNG without feedback almost immediately falls below this bound, while the application of feedback enabled the statistics to appear random until block sizes reach ≈2300 seconds. The period required to gather 100 blocks of this size is ≈64 hours.
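The frequency (monobit) test itself is compact. The sketch below follows the definition in SP 800-22 [13], where the P-value is $\text{erfc}(|{S}_{n}|/\sqrt{2n})$ with ${S}_{n}$ the sum of the bits mapped to ±1. The count-based variant and the block-level pass rule are conveniences added for the example; the 0.01 significance level is the suite's customary default and is assumed here.

```python
import math


def monobit_p_value(bits):
    """P-value of the NIST frequency (monobit) test for a bit sequence."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)  # map 0 -> -1, 1 -> +1 and sum
    return math.erfc(abs(s) / math.sqrt(2 * n))


def p_value_from_counts(n_ones, n_bits):
    """Same test evaluated from a count of logical-1s in n_bits bits."""
    s = 2 * n_ones - n_bits
    return math.erfc(abs(s) / math.sqrt(2 * n_bits))


def blocks_pass(p_values, alpha=0.01, min_pass=96):
    """True if at least min_pass of the block P-values are >= alpha."""
    return sum(p >= alpha for p in p_values) >= min_pass


# Worked example from SP 800-22: ten bits with six ones.
example_p = monobit_p_value([1, 0, 1, 1, 0, 1, 0, 1, 0, 1])
```

Because the test depends only on the number of ones in a block, it can be run on stored per-second count totals by summing the totals over each $N$-second block.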

The study of our relatively low-bitrate system highlights a critical deficiency that may be overlooked in the statistical testing of other physical random number generators. The 100 kHz rate of the PSRNG is several orders of magnitude lower than many modern RNG systems; for our system a 96 hour data set produces only 34.56 Gb, and processing this data set with the NIST test suite took over three days on a standard desktop computer. For systems with bit generation rates of hundreds of gigabits per second, a 96 hour data set would result in many petabytes, a size infeasible to test without specialized computing resources. Often, RNGs operating at high rates only test several gigabytes of data [19], which translates into ≈1 ms of operation. Such a brief sample is insensitive to the long-term drifts that are typical in many electronic systems. Conclusions drawn from such brief tests include significant assumptions about the system’s time (in)dependence. For many applications, particularly those involving security, those assumptions open potentially fatal issues, thus a more exhaustive testing approach that examines a system’s long-term behavior is required. For RNGs with high bit rates, this could be accomplished by only storing the per-second (or longer) count totals and executing the test suites, or Allan deviation analysis, on the compressed bit totals.
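The data volumes involved are easy to estimate. In the sketch below the 640 Gbit/s rate is used purely as an example of a fast generator (it is the rate in the title of [19]), and storing one 32-bit count total per second is an assumed storage format.

```python
HOURS = 96
SECONDS = HOURS * 3600

# PSRNG at 100 kHz: raw bits in a 96 hour data set.
psrng_rate_hz = 100_000
bits_psrng = psrng_rate_hz * SECONDS
gigabits_psrng = bits_psrng / 1e9

# Example fast generator at 640 Gbit/s over the same period.
fast_rate_bps = 640e9
petabytes_fast = fast_rate_bps * SECONDS / 8 / 1e15

# Keeping only one 32-bit count total per second instead of raw bits.
megabytes_compressed = SECONDS * 4 / 1e6
```

The raw record of the fast generator is tens of petabytes, while the per-second totals for the same period occupy on the order of a megabyte.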

## 7. Conclusions

We have presented an on-demand, ultra-low latency random number generator capable of generating a bit only 2.4(2) ns after request, and with excess predictability of just $6.4\times {10}^{-7}$. This type of specialized RNG is required for applications such as a loophole-free Bell test, which requires stringent bounds on the temporal extent of the bit generation process. Free running, the system exhibits drift that would require frequent adjustment to maintain an unbiased output, but with the addition of a feedback method we can eliminate the drift to such a degree that 24 hours of continuous output can pass the NIST RNG test suite.

While the moving average filter used to identify the drift was not optimal, its performance was sufficient for this initial exploration. Other filter types may be explored in the future. Where low-latency is required, reducing as much of the noise as possible through passive means is preferred over active methods, since stronger feedback results in a greater excess predictability through knowledge of the feedback signal.

As demonstrated above, the Allan variance is a particularly useful tool for quantifying an RNG’s time-domain stability and characterizing any non-idealities it may have. We find that Fourier analysis is useful to coarsely identify the relevant bandwidth over which feedback is needed (in our case the non-white frequency components largely reside below $2\times {10}^{-4}$ Hz), but the Allan variance was useful for bounding and fine tuning the strength of the feedback. The Allan variance has been used previously to analyze ring-oscillator-based RNGs, where it revealed an unknown random-walk-like behavior [20]. Finally, we find that some type of long-timescale characterization is important for RNGs that are put into continuous use, and this may be an overlooked, but nonetheless critical shortcoming of ultra-high speed RNGs for which the high output rate makes acquiring and testing long-term data sets particularly challenging.

The system has been packaged in a small enclosure with its own power supply, temperature controller, USB interface, and user display. Two copies of the system are being used in further Bell-test experiments, and another is being incorporated into the NIST Randomness Beacon, which is a public service providing time-stamped and digitally signed random bit strings [21].

## References

**1.** L. K. Shalm, E. Meyer-Scott, B. G. Christensen, P. Bierhorst, M. A. Wayne, M. J. Stevens, T. Gerrits, S. Glancy, D. R. Hamel, M. S. Allman, K. J. Coakley, S. D. Dyer, C. Hodge, A. E. Lita, V. B. Verma, C. Lambrocco, E. Tortorici, A. L. Migdall, Y. Zhang, D. R. Kumor, W. H. Farr, F. Marsili, M. D. Shaw, J. A. Stern, C. Abellán, W. Amaya, V. Pruneri, T. Jennewein, M. W. Mitchell, P. G. Kwiat, J. C. Bienfang, R. P. Mirin, E. Knill, and S. W. Nam, “Strong loophole-free test of local realism,” Phys. Rev. Lett. **115**(25), 250402 (2015).

**2.** P. Bierhorst, E. Knill, S. Glancy, Y. Zhang, A. Mink, S. Jordan, A. Rommal, Y.-K. Liu, B. Christensen, S. W. Nam, M. J. Stevens, and L. K. Shalm, “Experimentally generated randomness certified by the impossibility of superluminal signals,” Nature Lett. **556**(7700), 223–226 (2018).

**3.** H.-K. Lo, M. Curty, and B. Qi, “Measurement-device-independent quantum key distribution,” Phys. Rev. Lett. **108**(13), 130503 (2012).

**4.** J. Bell, “On the Einstein Podolsky Rosen paradox,” Physics **1**(3), 195–200 (1964).

**5.** S. J. Freedman and J. F. Clauser, “Experimental test of local hidden-variable theories,” Phys. Rev. Lett. **28**(14), 938–941 (1972).

**6.** E. Fry and R. Thompson, “Experimental test of local hidden-variable theories,” Phys. Rev. Lett. **37**(8), 465–468 (1976).

**7.** A. Aspect, P. Grangier, and G. Roger, “Experimental tests of realistic local theories via Bell’s theorem,” Phys. Rev. Lett. **47**(7), 460–463 (1981).

**8.** A. Aspect, P. Grangier, and G. Roger, “Experimental realization of Einstein-Podolsky-Rosen-Bohm gedankenexperiment: A new violation of Bell’s inequalities,” Phys. Rev. Lett. **49**(2), 91–94 (1982).

**9.** A. Aspect, J. Dalibard, and G. Roger, “Experimental test of Bell’s inequalities using time-varying analyzers,” Phys. Rev. Lett. **49**(25), 1804–1807 (1982).

**10.** C. Abellán, W. Amaya, D. Mitrani, V. Pruneri, and M. W. Mitchell, “Generation of fresh and pure random numbers for loophole-free Bell tests,” Phys. Rev. Lett. **115**(25), 250403 (2015).

**11.** M. Stipčević and R. Ursin, “An on-demand optical quantum random number generator with in-future action and ultra-fast response,” Sci. Rep. **5**(1), 10214 (2015).

**12.** M. A. Wayne, E. Jeffrey, G. M. Akselrod, and P. G. Kwiat, “Photon arrival time quantum random number generation,” J. Mod. Opt. **56**(4), 516–522 (2009).

**13.** A. Rukhin, J. Soto, J. Nechvatal, M. Smid, E. Barker, S. Leigh, M. Levenson, M. Vangel, D. Banks, A. Heckert, J. Dray, and S. Vo, “A statistical test suite for random and pseudorandom number generators for cryptographic applications,” NIST Spec. Publ. 800–22 (2010).

**14.** A. Migdall, S. V. Polyakov, J. Fan, and J. C. Bienfang, eds., *Single-photon generation and detection: Physics and applications* Experimental Methods in the Physical Sciences (Academic, 2013), 45.

**15.** W. J. Riley, *Handbook of frequency stability analysis* NIST Special Publication No. 1065 (2008).

**16.** L. Galleani and P. Tavella, “The dynamic Allan variance,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control **56**(3), 450–464 (2009).

**17.** L. Galleani, “The dynamic Allan variance III: confidence and detection surfaces,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control **58**(8), 1550–1558 (2011).

**18.** W. Press and S. Teukolsky, *Numerical recipes 3rd edition: The art of scientific computing* (Cambridge University Press, 2007).

**19.** L. Zhang, B. Pan, G. Chen, L. Guo, D. Lu, L. Zhao, and W. Wang, “640-Gbit/s fast physical random number generation using a broadband chaotic semiconductor laser,” Sci. Rep. **7**(1), 45900 (2017).

**20.** Microsemi, “Truth in randomness: Practical insights on randomness, the nature of the universe, and using ring oscillators as entropy sources for high-security applications,” (2001).

**21.** “NIST randomness beacon,” https://beacon.nist.gov/home (n.d.).