Optica Publishing Group

Deriving and dissecting an equally bright reference boundary

Open Access

Abstract

The Helmholtz-Kohlrausch effect signifies the discrepancy between brightness as a perceptual attribute and luminance as a physical metric across different chromaticities. Based on the concepts of brilliance and zero grayness proposed by Ralph Evans, equally bright colors were collected in Experiment 1 by asking observers to adjust the luminance for a given chromaticity to the glowing threshold. The Helmholtz-Kohlrausch effect is thus automatically incorporated. Similar to the diffuse white as a singular point along the luminance dimension, this reference boundary demarcates surface colors from illuminant colors and correlates with the MacAdam optimal colors, which provides not only an ecologically relevant basis but also a computational handle for interpolating to other chromaticities. By navigating across the MacAdam optimal color surface, the contributions of saturation and hue to the Helmholtz-Kohlrausch effect were further quantified via saturation scaling in Experiment 2. The implications of our findings for brightness modeling, color dimensions, and potential applications are discussed.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

1.1 Chromatic brightness/lightness model

Among the three fundamental attributes of color, i.e., hue, brightness, and saturation, brightness is so important that candela ($cd$) is the only unit in the International System of Units that is not totally independent of human perception. The standardization of photometry for quantifying brightness was marked with the luminous efficiency function $V(\lambda )$, which was established by using the psychophysical approach of heterochromatic flicker photometry. The resulting $V(\lambda )$ has a useful additive property (Abney’s law), under which any given spectral radiance can be linearly integrated to a luminance value with the unit of $cd/m^2$. In addition, the widely used lightness metric CIE $L^{*}$ (in CIELAB and CIELUV) is a function of the luminance of the target stimulus with a normalization relative to the “white”. However, it has been found that $V(\lambda )$ works well only for specific stimulus conditions where it was derived, i.e., high temporal and low spatial frequency [1,2]. The minimization of flicker perception in heterochromatic flicker photometry, or equal luminance, was found to correlate with other perceptual judgments such as minimally distinct borders/contours and disappearance of shadow perception [3,4]. A more natural psychophysics scheme, direct heterochromatic brightness matching by judging the brightness of two colors side by side, did not lead to a $V_b(\lambda )$ function with additivity property [5] and systematically consistent results across observers [6]. The physiological substrates for the dichotomy between luminance and brightness have been investigated. Luminance corresponds to the activations of L- and M-cones and the magnocellular system [1,3], whereas brightness seems to involve more of mid- and high-level perceptions [2] and the primary visual cortex (V1) [7].
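
The additivity that Abney's law gives to $V(\lambda )$ can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the $V(\lambda )$ values are coarse, rounded samples of the CIE 1924 photopic function at 50 nm steps, and the two spectra are made-up numbers, not measurements from this work.

```python
# Coarse, rounded samples of the CIE 1924 photopic V(lambda) at 50 nm steps.
# Illustration only: real photometry uses the official finely sampled table.
V_LAMBDA = {400: 0.0004, 450: 0.038, 500: 0.323, 550: 0.995,
            600: 0.631, 650: 0.107, 700: 0.0041}
K_M = 683.0      # lm/W, maximum photopic luminous efficacy
STEP_NM = 50.0   # sampling interval of the table above


def luminance(radiance):
    """Approximate luminance (cd/m^2) of a spectral radiance given as
    {wavelength_nm: W/(sr*m^2*nm)} sampled on the V_LAMBDA grid."""
    weighted = sum(radiance.get(w, 0.0) * v for w, v in V_LAMBDA.items())
    return K_M * STEP_NM * weighted


# Two hypothetical spectra (made-up numbers) and their additive mixture.
greenish = {500: 0.002, 550: 0.004, 600: 0.001}
reddish = {600: 0.003, 650: 0.005, 700: 0.002}
mixture = {w: greenish.get(w, 0.0) + reddish.get(w, 0.0) for w in V_LAMBDA}

# Under Abney's law the mixture luminance equals the sum of the parts.
print(luminance(greenish) + luminance(reddish), luminance(mixture))
```

Brightness matching, as described above, does not obey this linear rule, which is why no single weighting function of this kind reproduces it.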

The discrepancy between luminance and perceived brightness is also phenomenologically noted in the Helmholtz-Kohlrausch (H-K) effect; more saturated colors appear to be brighter with a hue dependency [8]. In other words, brightness, hue, and saturation are not independent within current color appearance models; when the luminances of chromatic stimuli are held constant to a neutral reference, the perceived brightness partly comes from their chromatic components. To develop a color appearance model with more independence or less interference between color attributes, incorporating the H-K effect is an essential first step [9,10]. To quantify the H-K effect, equations have been proposed for unrelated colors (Ware & Cowan system) and related colors [11,12]. However, according to a recent review on brightness modeling, none of the state-of-the-art models provide satisfactory results for the H-K effect [13]. The authors of [13] also adopted an H-K term in the chromatic component, i.e., colorfulness, to model the H-K effect found in their psychophysical data for self-luminous colors.

1.2 Brilliance and $G_0$ functions

A different approach to modeling chromatic colors’ brightness is to combine (in a colorimetric sense) brightness and saturation into a new perceptual attribute, “brilliance” coined by Evans. In his experiment [14], a series of monochromatic stimuli were centered with a neutral background, and the observer (Evans himself) adjusted the luminance of the center until it appeared in a mode between object color and self-luminous color, a state which Evans called “fluorence” (a perception to be distinguished from the physical “fluorescence” [15]) or equivalently color with zero gray content. Those threshold luminances for each wavelength are their $G_0$ functions. And such a concept can be generalized to the entire chromaticity diagram, as done for the H-K effect [6]. According to Fairchild [8], brilliance is some kind of apparent brightness that automatically incorporates the H-K effect, and $G_0$ defines the luminance of “equal chromatic brightness (really, just brightness)” for various chromaticities.

As mentioned in the specification of $L^{*}$, the difference between brightness and lightness is that brightness is more of an absolute color attribute, whereas lightness is defined by CIE as the “brightness of an area judged relative to the brightness of a similarly illuminated area that appears to be white or highly transmitting.” In other words, only related colors, as opposed to unrelated colors that are perceived in isolation from other colors, have lightness or gray content ($G_0$), and both related and unrelated colors have brightness. In this work, as the adaptation level (or the reference white) in the experiments was expected to be constant, equal lightness also corresponds to equal brightness, so in the following, we choose to use more intuitive “brightness” for the general case, and our focus is related colors.

Although no simple mathematical model of $G_0$ has been developed, its significance was further studied, especially by Nayatani and Heckaman & Fairchild [11,16–18]. Speigle and Brainard have also connected Evans’ work with chromatic adaptation [19] under the framework of estimating an equivalent illuminant and the surface reflectances. The boundary between reflective and self-luminous appearance modes is of great importance for both understanding the ecological aspect of color vision and color reproduction. In the movie industry, when colorists (over-)adjust the brightness/saturation of local objects, especially on a high-dynamic-range & wide-color-gamut (HDR & WCG) display where the H-K effect is more relevant, the colors may appear fluorescent. Such a warning boundary is approximately the $G_0$ gamut boundary [20,21].

1.3 Brightness versus saturation and hue

Given the chromaticities and their corresponding $G_0$ luminances, an equally bright surface (or a 2-D look-up table for practical applications) can be defined. Across this surface, color can be changed in terms of saturation and hue while maintaining brightness. The H-K effect with respect to saturation (or chroma as the luminance is kept constant) is well accepted, whereas the description of the contribution from hue is often less quantitative. One of the exceptions was done by Uchikawa et al. [22] who first collected equally bright colors, via brightness matching and flicker photometry, across different levels of dominant wavelength and excitation purity and then asked observers to judge their saturation. Therefore, with the constant-saturation loci, the factor of dominant wavelength (or hue) was separated and better quantified. Only two of the authors served as observers, and the stimuli had a low luminance level with an adaptation field of $150\,td$.

In this work, the concepts of brilliance and zero grayness were revisited by two psychophysical experiments with stimuli rendered on an HDR & WCG display. Specifically, in Experiment 1, the $G_0$ luminances for a group of representative chromaticities were collected, which were verified as equally bright via paired comparisons and were found to highly correlate with two different physical gamut boundaries including the MacAdam optimal colors. In Experiment 2, the equally bright MacAdam optimal colors were used as stimuli for saturation scaling, and the relations between saturation and physical purity and the H-K effect were analyzed. We describe the details and results of the two experiments in the following sections. In addition, the stimulus background used in Experiment 1 was further investigated by either changing its luminance level or adding a color-checker. The findings and implications from those results are discussed and summarized in the end.

2. Scaling experiments

2.1 General settings and stimulus selection

The protocols of our work received approval from the Human Subjects Research Office at Rochester Institute of Technology. All experiments were conducted in a dark room (with most surfaces covered in black) where the stimuli were presented on an Apple Pro XDR display [23]. The display had a peak luminance of ${\sim }1500\,cd/m^2$ and a color gamut of approximately DCI-P3. The graphical user interface was programmed in Apple’s Swift language and the dynamic range and color gamut/encoding were managed by the Metal API [24]. The characterization accuracy based on a linear model (3-by-3 matrix) and three 10-bit RGB look-up-tables (LUTs) [25] achieved an average of ${\sim }0.57\,\Delta E_{00}$ for randomly sampled colors across the gamut, which was considered adequate for our objectives. The linearity for each RGB encoding channel was guaranteed by checking the relation between single-channel control value and the corresponding measured CIEXYZ values before channel saturation. The display was allowed to warm up and then re-calibrated prior to each data collection session.
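
A display model of the kind described above (per-channel LUTs followed by a 3-by-3 matrix) can be sketched as follows. This is a minimal illustration, not the characterization used in the experiments: the LUT is a placeholder power function, the matrix is the familiar sRGB/D65 matrix standing in for the measured one, and the function name `rgb_code_to_xyz` is hypothetical.

```python
# Placeholder per-channel LUT: 10-bit code value -> linear drive in [0, 1].
# A measured LUT would replace this assumed 2.2 power function.
LUT_SIZE = 1024
PLACEHOLDER_LUT = [(i / (LUT_SIZE - 1)) ** 2.2 for i in range(LUT_SIZE)]
LUTS = (PLACEHOLDER_LUT, PLACEHOLDER_LUT, PLACEHOLDER_LUT)

# Placeholder 3x3 matrix (sRGB primaries, D65 white), NOT the measured display.
M = ((0.4124, 0.3576, 0.1805),
     (0.2126, 0.7152, 0.0722),
     (0.0193, 0.1192, 0.9505))


def rgb_code_to_xyz(r, g, b):
    """Forward model: 10-bit RGB codes -> relative CIE XYZ (white Y = 1)."""
    linear = [LUTS[k][code] for k, code in enumerate((r, g, b))]
    return tuple(sum(M[row][k] * linear[k] for k in range(3))
                 for row in range(3))


print(rgb_code_to_xyz(1023, 1023, 1023))  # display white in relative XYZ
```

The matrix stage assumes channel additivity and constant primary chromaticities, which is what the single-channel linearity check before channel saturation is meant to support.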

Experiment 1 collected the $G_0$ luminances by asking observers to adjust the brightness of a color patch at constant chromaticity, the results of which were found to correlate with the MacAdam optimal colors. In Experiment 2, the (linearly scaled) MacAdam optimal colors served as the equally bright stimuli, and the observers, given two reference colors, adjusted the center patch to be their perceptual midpoint in terms of saturation. The experiments shared similar configurations; Experiment 1 only presented the center patch, whereas Experiment 2 showed a triplet of patches as depicted in Fig. 1. Each square patch covered a 3-by-3 deg field size with 1-deg separations between them, and the rest of the full screen was filled with random neutrals as the background, corresponding to a field size of 39-by-22 deg. The observer was seated in front of the center of the display at a distance of 1 meter and adjusted the seat height to set their eye level to the stimuli level. The background’s lightness levels ranged from $L^{*}$ of $0$ to $100$ at a uniform interval of $25$ so that the average was linearly integrated to $L^{*}$ of $50$ or ${\sim }18\%$ gray [26]. The absolute luminance levels were thus relative to the peak pixel in the background set as $200\,cd/m^2$. The display’s native resolution was 6016-by-3384 and the smallest unit of the random neutrals had 10 pixels. On the top of the screen, there was a text box showing the cue for the corresponding instruction as described in the next section.
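
To make the background specification concrete, the sketch below converts the five lightness levels to absolute luminances with the standard CIE 1976 $L^{*}$ inverse, taking the $L^{*}=100$ elements as the $200\,cd/m^2$ peak. The variable names are illustrative choices, not taken from the experiment software.

```python
def lstar_to_relative_luminance(lstar):
    """Invert CIE 1976 L* to relative luminance Y/Yn in [0, 1]."""
    if lstar > 8.0:
        return ((lstar + 16.0) / 116.0) ** 3
    return lstar * 27.0 / 24389.0  # linear segment, i.e. L* / 903.3


PEAK = 200.0                   # cd/m^2, luminance of the L* = 100 elements
levels = [0, 25, 50, 75, 100]  # background lightness levels from the text

background = {L: PEAK * lstar_to_relative_luminance(L) for L in levels}
for L, lum in background.items():
    print(f"L* = {L:3d} -> {lum:6.1f} cd/m^2")

mean_lstar = sum(levels) / len(levels)  # average lightness of the pattern
```

With these values, $L^{*}=50$ corresponds to a relative luminance of about $18.4\%$, which is the "${\sim }18\%$ gray" referred to above.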

Fig. 1. Stimulus configuration in the experiments. Each color patch covered a 3-by-3 deg visual field with 1-deg separations between them. In Experiment 1, only the center patch was shown and adjusted to the $G_0$ level. The triplet of color patches was used in Experiment 2 where the observer adjusted the center patch to be the perceptual midpoint between the two flanking patches in terms of saturation. The background had random neutrals at different lightness levels.

Given the display gamut constraint, a group of chromaticities corresponding to representative colors from the Munsell color system were selected. For each hue (5R (red) / Y (yellow) / G (green) / B (blue) / P (purple)), the maximum chroma level within the display gamut was first determined, then three different chroma levels were selected for each hue with the same lightness value. Thus the chromaticities of those 15 Munsell colors plus the perfectly reflective white under CIE D65, plotted in Fig. 2 and listed in Table 1, were used as the stimuli in Experiment 1. Those max-saturated hues were used as the starting references in Experiment 2.

Fig. 2. Stimulus chromaticities versus display gamut. Five Munsell hues at three different levels of chroma plus a neutral D65 were included.

Table 1. Stimulus chromaticities and their Munsell specifications (when at particular relative luminances $Y$).

2.2 Experiment 1: brilliance scaling

2.2.1 Stimuli and observer task

The primary objective of the first experiment was to collect $G_0$ data for those selected chromaticities under a fixed adaptation. After a two-minute adaptation, the observer was asked to “adjust the brightness of the center patch until it just appears to have no grayness, or equivalently just about to glow or cease to glow depending on the starting point”. The concept of grayness was illustrated by first increasing the luminance of the neutral (stimulus #16), which appeared from black to gray, and gradually to white with no grayness and to the glowing state. For other chromaticities as well as the opposite adjustment direction the demonstration was similarly repeated, and the observers were expected to learn to generalize grayness to those non-neutral colors. Twelve observers (7 M & 5 F; average age of 31 with a standard deviation of 8) with normal color vision, as tested by Ishihara plates, completed this experiment. 11 of them, including two of the authors, had a color science background and experience with psychophysical experiments. One naive observer participated and achieved a similar level of repeatability.

The stimulus was constrained to vary only in luminance while its chromaticity stayed constant, which perceptually corresponds to Evans’s brilliance dimension and was more intuitively worded as “brightness” in the instruction. The observer used a keyboard to adjust the luminance. At each trial, the stimulus started with either the lowest or the highest luminance within the gamut. For the $200\,cd/m^2$ peak background, the fine step was $5\,cd/m^2$ and the large step was $30\,cd/m^2$. According to post-experiment interviews, the steps provided were not a constraint for the observers. The observer was advised to “adjust along either increasing or decreasing brightness direction” as much as they could. This constraint was similar to that typically used in the method of limits and was adopted here because the adjustment direction might have a temporal effect on the adaptation, which we hoped to average out. Observers were also allowed to reverse back with a step of $90\,cd/m^2$, especially when they passed their $G_0$ thresholds. Both up and down directions were repeated three times, so $16 \times 6 = 96$ trials were completed in random order. A training session covering all chromaticities and both adjustment directions was provided, and the observer could practice until they felt confident about the task. There was no time limit for each trial, and it took the observers about 1 hour on average to complete the experiment.
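
Adjusting luminance at constant chromaticity amounts to changing only $Y$ in an $xyY$ specification. The sketch below shows the standard $xyY$ to $XYZ$ conversion together with a key-press update using the fine and large steps quoted above. The function names and the gamut limits passed to `adjust` are hypothetical; the actual experiment software was written in Swift and is not reproduced here.

```python
def xyY_to_XYZ(x, y, Y):
    """Standard conversion from chromaticity (x, y) and luminance Y to XYZ."""
    if y <= 0:
        raise ValueError("y must be positive")
    X = x / y * Y
    Z = (1.0 - x - y) / y * Y
    return X, Y, Z


FINE_STEP = 5.0    # cd/m^2, from the text
LARGE_STEP = 30.0  # cd/m^2, from the text


def adjust(Y, direction, large, y_min, y_max):
    """Move luminance up (+1) or down (-1), clamped to assumed gamut limits."""
    step = LARGE_STEP if large else FINE_STEP
    return min(y_max, max(y_min, Y + direction * step))


d65 = (0.3127, 0.3290)                             # CIE D65 chromaticity
Y = adjust(200.0, +1, True, y_min=0.0, y_max=1500.0)  # one large step up
X, Y, Z = xyY_to_XYZ(d65[0], d65[1], Y)
print(X, Y, Z, X / (X + Y + Z))                     # chromaticity x is unchanged
```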

2.2.2 Results: $G_0$ and intra- & inter-observer variations

In Fig. 3(a), each dot represents the $G_0$ luminance, averaged across six repeats, of the stimulus indexed on the x-axis for each observer. The vertical range reflects the large inter-observer variation. Interestingly, the variation is very similar on the log scale (the maximum to minimum ratio ranges from $3.2$ to $5.0$) across the stimuli. An observer who set a higher $G_0$ luminance for some chromaticities also set higher luminances for the other chromaticities, with Observers 1 and 4 as the two extreme cases. Figure 3(b) shows the $G_0$ luminance averaged across all observers, which is consistent with the H-K effect as a function of both hue and chroma. Red and purple have a stronger H-K effect, so they require lower luminances to be equally bright. For a given hue, a more saturated color appears brighter when iso-luminant to the neutral chromaticity, Stimulus #16, and thus needs less luminance at a constant brightness level. Those monotonically increasing curves from high chroma to low chroma are mostly shared in the individual results in Fig. 3(a), especially for red and purple. For yellow, green, and blue, the H-K effect was not clearly found for a couple of observers, such as Observers 2 and 6.

Fig. 3. $G_0$ luminances under $200\,cd/m^2$ peak background. In 3(a), each dot represents one individual observer’s result averaged across six repeats for the stimulus on the x-axis, where the index corresponds to Table 1. Each observer is represented by a different marker symbol in the legend. The $G_0$ luminances averaged across the observers are plotted in 3(b). The error bars indicating the inter-observer variations represent the standard deviations of those individual $G_0$ luminances.

Table 2 lists the numeric values of the $G_0$ luminance results averaged over both repeats and observers as well as intra-observer variations. For each observer & chromaticity combination (6 repeats), the intra-observer variations are quantified by the standard deviation (std.) and coefficient of variation (CV), both of which are averaged over observers in Table 2 to mainly show the chromaticity dependency. CV values, by normalizing the standard deviation to the means, are more consistent across chromaticities. The intra-observer variation is considered large relative to typical brightness matching experiments. Those variations indicated both the difficulty of judging $G_0$ or luminosity and the individual difference. And according to our observers’ feedback, (zero) grayness was less intuitive (see also the discussion on its implicitness in [27]) and more difficult to judge than the glowing threshold, the latter of which might be their primary decision criterion. Previous studies reported different ratios between $G_0$ or luminosity threshold and the background/illumination [28,29] and high inter-observer variations in a similar task [30]. And the individual difference in the $G_0$ results might resemble that in estimating the illumination’s chromaticity [31,32], but in the luminance dimension.
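
The two intra-observer statistics can be computed as sketched below. The settings are made-up numbers, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def intra_observer_variation(settings):
    """settings: {observer_id: [repeated G0 luminances]} for one chromaticity.
    Returns (std, CV), each averaged across observers."""
    stds, cvs = [], []
    for repeats in settings.values():
        s = stdev(repeats)             # sample standard deviation (assumed)
        stds.append(s)
        cvs.append(s / mean(repeats))  # CV normalizes std by the mean
    return mean(stds), mean(cvs)


# Hypothetical data: obs_b sets exactly twice the luminance of obs_a.
example = {
    "obs_a": [300.0, 320.0, 310.0, 290.0, 305.0, 315.0],
    "obs_b": [600.0, 640.0, 620.0, 580.0, 610.0, 630.0],
}
avg_std, avg_cv = intra_observer_variation(example)
print(avg_std, avg_cv)
```

In this toy case the standard deviation doubles with the luminance level while the CV is identical for both observers, which is the sense in which CV is the more consistent measure.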

Table 2. Average $G_0$ results under $200\,cd/m^2$ peak background and intra-observer variations quantified by the standard deviation (std.) and coefficient of variation (CV) averaged across observers.

Furthermore, in a follow-up paired comparison experiment that we previously reported [33], we explicitly asked another group of observers to judge “which stimulus is brighter?” between those average $G_0$ results in Fig. 3(b). Some observers participated in the $G_0$ collection experiment but were not aware of the purpose of the follow-up experiment. The $G_0$ colors were found to be approximately equally bright across different chromaticities as their derived Thurstonian scales were not significantly different. Thus, the hypothesis suggested in [8] that $G_0$ defines the luminance of equal chromatic brightness for various chromaticities is verified, and the H-K effect is automatically incorporated.
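
For readers unfamiliar with Thurstonian scaling, the sketch below derives interval scale values from a paired-comparison count matrix. It assumes Thurstone's Case V and a simple clamping rule for unanimous pairs; the text does not specify which variant was used in [33], and the counts are hypothetical.

```python
from statistics import NormalDist


def thurstone_case_v(wins):
    """wins[i][j] = times stimulus i was judged brighter than stimulus j.
    Returns zero-mean scale values in z-score units (Case V assumption)."""
    n = len(wins)
    inv_cdf = NormalDist().inv_cdf
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            p = wins[i][j] / total
            # Clamp unanimous proportions so the inverse CDF stays finite.
            p = min(max(p, 0.5 / total), 1.0 - 0.5 / total)
            z_sum += inv_cdf(p)
        scale.append(z_sum / n)
    center = sum(scale) / n
    return [s - center for s in scale]


# Hypothetical counts: three stimuli chosen equally often in every pair.
balanced = [[0, 5, 5], [5, 0, 5], [5, 5, 0]]
print(thurstone_case_v(balanced))  # all scale values are zero
```

Equal brightness across stimuli shows up as scale values that do not differ significantly from one another, as in the balanced toy case.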

From the paired comparison verification as well as the high correlations described in the next section, we believe that the three repetitions along both up and down directions can average out the impacts of temporal adaptation in increment/decrement adjustments. And while there exist inter-observer variations, the results averaged from observers converge to be equally bright, which was acceptable to another group of observers. As luminance on the log scale seems more useful than linear luminance, the average $G_0$ across either trials or observers might be better calculated using geometric means instead of arithmetic means. The comparison was done and the results were generally similar [33]. Given the relatively large variations and that the paired comparison verification was based on the arithmetic means and achieved promising results, we chose to stay with the arithmetic mean results.
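
The difference between the two averaging schemes can be seen in a small numeric example. The three settings below are hypothetical, chosen so that their maximum-to-minimum ratio of 4 falls within the $3.2$ to $5.0$ range reported earlier.

```python
from math import exp, log


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    """Mean taken on the log scale, then mapped back to linear units."""
    return exp(sum(log(v) for v in values) / len(values))


settings = [250.0, 400.0, 1000.0]  # hypothetical G0 luminances in cd/m^2
print(arithmetic_mean(settings))   # pulled upward by the highest setting
print(geometric_mean(settings))    # about 464 cd/m^2
```

The geometric mean is never larger than the arithmetic mean, and the gap grows with the spread of the settings on the log scale.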

2.2.3 Results: $G_0$ versus MacAdam optimal colors

The relation between the $G_0$ lightness/brightness threshold and appearance mode boundaries has been discussed by Evans [15] and others [19,29], where MacAdam’s optimal colors [34] were considered to determine the physical constraint of the observer’s prior of whether a color appears reflective or self-luminous. Fairchild and Heckaman suggested using zero blackness in the Natural Colour System (NCS) as a computational proxy for $G_0$ [9]. Figure 4 presents the relations between the collected $G_0$ luminance and the two physical gamut boundaries or luminance thresholds for a given chromaticity, which have high correlations ($r = 0.9390, p<0.001$ and $r = 0.9076, p<0.001$, for optimal colors and NCS zero blackness, respectively). The optimal colors, which are mathematically ideal square-shaped reflectances [34,35], usually have slightly higher luminances than the physically realizable NCS zero blackness, with a high correlation between them ($r = 0.9345, p<0.001$). The high correlation between the $G_0$ results and the MacAdam optimal colors’ luminance aligns with the argument in [19,32] that the visual system internalizes, via the interactions with the physical environment, the achievable luminance of a given chromaticity under an (estimated/equivalent) illumination. In turn, it implies that the MacAdam optimal colors can (theoretically) serve as an ecological basis for the $G_0$ results.
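
The correlation coefficients quoted above are Pearson coefficients between two lists of luminances, one value per chromaticity. A minimal sketch of that computation follows; the two lists are placeholder numbers for illustration and are not the data of Fig. 4.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Placeholder luminances in cd/m^2 (NOT the measured or computed values).
g0_visual = [180.0, 260.0, 340.0, 410.0, 500.0]
optimal_physical = [70.0, 95.0, 140.0, 150.0, 200.0]
print(pearson_r(g0_visual, optimal_physical))
```

Note that a high $r$ only indicates a linear relation; it says nothing about the slope, so a constant ratio between visual and physical luminances is compatible with it.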

Fig. 4. $G_0$ under $200\,cd/m^2$ peak background versus NCS zero blackness and the optimal colors’ luminance under a $200\,cd/m^2$ illumination. The $G_0$ luminances on the x-axis are the averaged psychophysical results in Fig. 3(b), and the luminance thresholds on the y-axis are two physical limits calculated via the methods in [9] and [35], respectively.

And there is a ratio of ${\sim }2.6$ between the visual results and the physical results. Perfectly diffuse white has a luminance of $200\,cd/m^2$ under a $200\,cd/m^2$ illumination whereas the neutral chromaticity (Stimulus #16) had a higher luminance to be $G_0$ under our $200\,cd/m^2$ peak background. While there exist different factors causing this discrepancy such as the visual field size, the equivalence or lack thereof between the background and illumination is further discussed in the Discussion section.

2.3 Experiment 2

2.3.1 Stimuli and observer task

Since the $G_0$ results are considered equally bright and correlate with the MacAdam optimal colors, navigating along the surface of optimal colors across different chromaticities can keep brightness constant while changing hue and saturation. To further dissect the equally-bright surface and isolate the separate contributions of hue and saturation to the H-K effect, Experiment 2 used the optimal colors under a $200\,cd/m^2$ D65 that were linearly scaled with a factor of $2.49$, which were expected to be equivalent to the $G_0$ under the background with a $200\,cd/m^2$ peak luminance.

For each hue in Fig. 2, the method of partition scaling was used to derive its uniform saturation scale. With the stimulus configuration in Fig. 1 and the neutral and the max-saturated chromaticity in Fig. 2 as the starting reference anchors, the task for the observer was to adjust the middle patch to be the perceptual midpoint of the two anchors in terms of saturation. The underlying physical dimensions for the hue and saturation decomposition were the dominant wavelength and excitation purity on the CIE xy chromaticity diagram. For the five max-saturated hues R (red) / Y (yellow) / G (green) / B (blue) / P (purple), the dominant wavelength was $629$, $576$, $512$, $485$, $560$ nm (the complementary), respectively. The center patch, when being adjusted, had a constant dominant wavelength but varying excitation purity between the two anchors, with a fine step of $0.005$ and a large step of $0.015$ excitation purity difference. The starting point for each trial was randomly set between the two anchors, and there was no time limit for the observer in each trial. The observer first made adjustments for finding the 50% saturation, repeated five times in random order among all the hues. Then, the averaged results became the new anchor for collecting the 25% and 75% saturation levels, with the neutral and the max-saturated hue as the other anchors, respectively. Those second-session trials were also repeated five times and in random order. It took each observer about 40 minutes to complete all 75 trials.
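
Holding the dominant wavelength constant while varying excitation purity means moving along a straight line on the xy diagram between the white point and a boundary point. The sketch below shows that geometry under stated assumptions: D65 is the white point, and `BOUNDARY` is a placeholder coordinate rather than the actual spectral-locus point of any hue used here.

```python
D65 = (0.3127, 0.3290)   # white point chromaticity
BOUNDARY = (0.70, 0.30)  # placeholder boundary point, for illustration only


def chromaticity_at_purity(white, boundary, purity):
    """xy chromaticity at a given excitation purity along white -> boundary."""
    (xw, yw), (xb, yb) = white, boundary
    return xw + purity * (xb - xw), yw + purity * (yb - yw)


def excitation_purity(white, boundary, xy):
    """Purity of a point on the white -> boundary line, computed on the
    axis with the larger span for numerical stability."""
    (xw, yw), (xb, yb), (x, y) = white, boundary, xy
    if abs(xb - xw) >= abs(yb - yw):
        return (x - xw) / (xb - xw)
    return (y - yw) / (yb - yw)


# Physical midpoint between the neutral and a purity-0.6 anchor, then one
# fine step (0.005) toward higher purity, as an observer might do.
start = 0.5 * (0.0 + 0.6)
print(chromaticity_at_purity(D65, BOUNDARY, start + 0.005))
```

The physical midpoint in purity is only a starting guess; the perceptual midpoint set by an observer need not coincide with it.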

Before the experiment, a training session for explaining the definition of saturation was included by showing the observer the physical DIN 6164 color samples [36]. In particular, they were instructed to get an impression of, not necessarily memorize, how saturation can be scaled across different hues. The digital version of the DIN system [37] was also shown. A practice session was done before the formal experiment until the observer felt comfortable with the task. Since the partition scaling was done within each hue, to obtain the absolute saturation levels across the five hues, a supplementary experiment via magnitude estimation [38] was performed after the observer finished the partition scaling task. Given the neutral and the max-saturated red as references for saturation of $0$ and $100$, respectively, the observer was asked to assign a positive number (that could exceed $100$ if necessary) to the other max-saturated hues as well as all 50%-saturated hues collected via the partition scaling part. Each estimation was repeated twice and in random order. In total, 14 observers (10 M & 4 F; average age of 31 with a standard deviation of 9) participated in this experiment, and 13 of them had a color science background and normal color vision. Seven of them also participated in the first experiment.

2.3.2 Results: saturation versus excitation purity

Using the magnitude estimated relative to the reference of max-saturated red as 100, the rest of 75, 50, and 25% saturation is scaled to each hue’s maximum 100% saturation. For the average observer, the estimated magnitudes were averaged across the observers, and the median excitation purity from all observers’ trials was taken from those 75, 50, and 25% saturation adjustment results. Figure 5 shows the saturation magnitude as a function of excitation purity for each hue, which presents roughly linear relations with different slopes. This confirms that excitation purity can be an approximate metric describing saturation for a given hue. However, the difference in their slopes reflects the anisotropy across different hues (or dominant wavelengths). Compared with the saturation estimates for equally bright monochromatic stimuli (Fig. 9) in [22], only green in our results seems particularly inconsistent, that is, if we assume the linearity can hold for extrapolating to higher excitation purity, the predicted saturation for the monochromatic green would be higher than other hues. This inconsistency might come from the limited color gamut in our stimulus set in that such extrapolation would not be guaranteed and in that the observers might overestimate. The individual observer’s result was analyzed, similar to Fig. 5, and the slope order of the five hues was mostly similar across observers, meaning the overestimation, if any, was shared by our observers.

Fig. 5. Saturation against excitation purity for each hue. Each dot corresponds to the average observer’s saturation estimate relative to the adjustment result of excitation purity. The horizontal error bars indicate the inter-observer variability of the partition scaling adjustments, and the vertical error bars correspond to the variability of the estimated saturation magnitudes. The total length of each error bar corresponds to one standard deviation for visualization clarity.

2.3.3 Results: separating hue and saturation contributions to the H-K effect

While the results in the previous section are helpful in predicting saturation for different chromaticities that are equally bright, by considering the luminance difference in the MacAdam optimal colors, the contributions from hue and saturation to the H-K effect can be better quantified. Figure 6 plots the optimal colors’ luminance decreasing as saturation or excitation purity increases. Zero saturation corresponding to the neutral has an optimal luminance of about $500\,cd/m^2$. Given a saturation level, for example, 50, the decreasing magnitudes relative to the neutral are high for red and purple but low for yellow and green. This decreasing magnitude is equivalent to the traditional B/L ratio metric for quantifying the H-K effect. In other words, for those hues with the same saturation, red and purple have a stronger H-K effect. Compared with the results in [39], where the chromatic strength for the same five Munsell hues was reported (Fig. 1), the order in our results seems consistent at first sight. However, Nayatani’s stimuli had the same Munsell chroma with different Munsell values for being equally bright. The chromatic strength could be a reflection of saturation (chroma over lightness) instead, which we have equalized over hues. Note that perceptual hue is approximately described by dominant wavelength here, and our observers did not report a noticeable hue change during the adjustments. Since those hue names can include other dominant wavelengths as well, hue scaling is necessary when a more fine-grained quantification of the H-K effect is needed.

Fig. 6. The optimal color luminance against saturation estimate for each hue. The saturation scale of each dot corresponds to the y-axis in Fig. 5. The luminance on the y-axis was calculated from the MacAdam optimal colors when designing the stimuli.

3. Discussion

3.1 $G_0$ under different backgrounds

The stimulus background was expected to provide a stable adaptation state. To test other adaptation levels [40] and potentially develop absolute brightness scales accordingly, the random neutral pattern with the same uniform lightness distribution but two different peak luminances, $50$ and $100\,cd/m^2$, was used. All the observers in Experiment 1 repeated the same task under the three backgrounds in random order. And the stimulus’s high starting point in each trial was scaled to a half and a quarter of the gamut limits, respectively, so that the stimuli did not appear over-bright while still glowing and the observer could lower the luminance more efficiently. The adjustment steps were set in a similar way.

Figure 7 shows the $G_0$ luminances across adaptation backgrounds, plotted as peak $100$ or $200\,cd/m^2$ versus peak $50\,cd/m^2$. Both exhibit high linear correlations ($r = 0.9835, p<0.001$ and $r = 0.9842, p<0.001$, respectively). However, the linearity did not follow the $Y/Y_n$ invariance ($Y_n$ as the reference white) indicated by the two dashed lines, which would be expected if $G_0$ under each adaptation had the same lightness [9]. The lower slopes might suggest that $G_0$ already reached a saturated state of lightness, and the relation between the background used in the experiment and a computational $Y_n$ (“what is white?”) needs further investigation.
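
The $Y/Y_n$ invariance test can be phrased as a slope comparison: if $G_0$ scaled with the background peak, the $200$ versus $50\,cd/m^2$ data would fall on a line of slope 4 and the $100$ versus $50\,cd/m^2$ data on a line of slope 2. The sketch below fits a slope through the origin and compares it with that prediction; the luminance lists are made-up numbers, not the data of Fig. 7, and fitting through the origin is an assumption made here for simplicity.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope of y = k * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def expected_slope(peak_reference, peak_test):
    """Slope predicted by Y/Yn invariance when Yn scales with the peak."""
    return peak_test / peak_reference


# Made-up G0 luminances (cd/m^2) under 50 and 200 cd/m^2 peak backgrounds.
g0_peak_50 = [100.0, 150.0, 200.0]
g0_peak_200 = [250.0, 370.0, 510.0]

fitted = slope_through_origin(g0_peak_50, g0_peak_200)
print(fitted, expected_slope(50.0, 200.0))  # fitted slope falls below 4
```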

Fig. 7. $G_0$ luminances under different backgrounds. The luminances on the x-axis are the $G_0$ luminances under the background with peak pixels of $50\,cd/m^2$. The luminances on the y-axis of the solid circles correspond to the $100\,cd/m^2$ background, and those of the diamonds correspond to the $200\,cd/m^2$ background. Each chromaticity in Fig. 2 is color-coded approximately. The dashed lines indicate where the data would fall if $G_0$ scaled linearly with the background. The Pearson correlation coefficients ($r$) are indicated.
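The comparison in Fig. 7 involves two separate quantities: the Pearson correlation between the $G_0$ luminances under two backgrounds, and the slope of their relation compared with the slope predicted by the $Y/Y_n$ invariance. The following is a minimal sketch in Python, assuming that $Y_n$ scales in proportion to the background peak luminance, so that the predicted slope is the ratio of the two peaks. The data arrays are hypothetical placeholders rather than the measured $G_0$ values, and the slope is fitted through the origin as one simple choice among several.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def slope_through_origin(x, y):
    """Least-squares slope of y = k * x (no intercept)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical G0 luminances (cd/m^2); illustrative only, NOT the measured data.
g0_peak50 = [20.0, 30.0, 40.0, 50.0]
g0_peak100 = [30.0, 45.0, 60.0, 75.0]

# Slope predicted if G0 kept a constant Y/Y_n, assuming Y_n is proportional
# to the background peak luminance (an assumption of this sketch).
expected_slope = 100.0 / 50.0

r = pearson_r(g0_peak50, g0_peak100)
slope = slope_through_origin(g0_peak50, g0_peak100)

print(f"r = {r:.4f}, fitted slope = {slope:.2f}, expected slope = {expected_slope:.2f}")
```

A high $r$ together with a fitted slope below the expected one corresponds to the pattern described in the text: the $G_0$ luminances are linearly related across backgrounds but do not scale in full proportion to the background peak.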

We also conducted a follow-up experiment by adding a color-checker below the target patch, which might provide more image cues [41] and possible color references, such as its white patch. Under the $200\,cd/m^2$ peak background, the color patches on the color-checker corresponded to their spectral reflectances under a $200\,cd/m^2$ D65. A physical color-checker was shown to the observer under a similar level of illumination. Three observers from Experiment 1 and another observer did this experiment. Only the middle-level chroma for each hue and the neutral were tested, with two repeats for different starting points. A background without a color-checker, as in Experiment 1, was repeated for baseline comparison. A vertically mirrored color-checker, where the white patch was closer to the stimulus, was also tested. The order of the three conditions (no color-checker, color-checker with the neutral patches in the bottom row, and color-checker vertically mirrored) was randomized. The observer repeatability under the three conditions was very similar, and there was no significant difference in the results averaged across the repeats. The presence of a color-checker might not be helpful in this task, but interestingly, the results for the three observers from Experiment 1 showed a smaller difference from the average $G_0$ in Fig. 3(b).

The random neutral background was also expected to be equivalent to a uniform neutral background in terms of the average luminance [26] and to produce less simultaneous contrast [42]. The average luminance may be too simple a descriptor for matching a realistic background, such as the equivalent illuminated scene in [19]. There might also be different interactions between the background, or the stimulus configuration in general, and the individual observer [26], as well as variations in how each observer understands or perceives brightness [43,44]. Nevertheless, the results from the paired comparison experiment [33] and another study based on the current average observer’s results [10] confirm that those $G_0$ luminances appear equally bright, at least approximately for the average observer in the two experiments. Note that in [10] a small amount of H-K effect was found for the MacAdam optimal (not exactly $G_0$) colors with varying saturation levels, which in turn highlights the significance of the H-K effect for equal-luminance colors; the potential reasons for this small H-K effect are discussed in [10]. How to connect the results in this work with either a uniform background or more complex spatial contexts remains to be answered and calls for approaches to characterizing the viewing field [8,45].

3.2 Flattening the MacAdam optimal color surface

Our design of navigating across the optimal colors in Experiment 2 took advantage of their high correlation with the $G_0$ results. As different chromaticities have different optimal luminances (an irregular tent shape), the property of being equally bright can be visualized as flattening such a surface. The reference diffuse white ($Y_n$) in CIELAB has a nominal lightness of $100$. The neutral stimulus #16 at $G_0$ likely has a higher lightness, and the other stimuli at $G_0$ have the same value. The normalization using $Y_n$ as a reference in calculating lightness for the grayscale can be generalized to all the chromaticities, where their $G_0$ serves as the anchor, as has been done in the DIN color order system [36] and our recent work [10].
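The idea of generalizing the $Y_n$ normalization can be illustrated numerically. The following is a minimal sketch in Python that reuses the standard CIELAB $L^{*}$ formula but replaces $Y_n$ with a chromaticity-specific anchor luminance. Reusing the CIELAB form is an assumption made here for illustration; it is not the formula used in the DIN system [36] or in [10]. The anchor values in `g0_anchor` are hypothetical placeholders, and the function name `anchored_lightness` is introduced only for this example.

```python
def cielab_f(t):
    """Standard CIELAB compression function applied to a relative luminance."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def anchored_lightness(luminance, anchor_luminance):
    """CIELAB-style lightness relative to an arbitrary anchor luminance.

    With anchor_luminance = Y_n this is the usual L*. Passing the G0 luminance
    of the stimulus chromaticity instead is the generalization sketched here.
    """
    if anchor_luminance <= 0:
        raise ValueError("anchor luminance must be positive")
    return 116.0 * cielab_f(luminance / anchor_luminance) - 16.0


# Hypothetical G0 anchor luminances (cd/m^2); illustrative only.
g0_anchor = {"neutral": 300.0, "red": 80.0}

# By construction, each chromaticity at its own G0 receives the same value.
for name, g0 in g0_anchor.items():
    print(name, round(anchored_lightness(g0, g0), 6))

# Two stimuli at the same fraction of their own G0 also share one value.
half_neutral = anchored_lightness(0.5 * g0_anchor["neutral"], g0_anchor["neutral"])
half_red = anchored_lightness(0.5 * g0_anchor["red"], g0_anchor["red"])
print(round(half_neutral, 6), round(half_red, 6))
```

In this sketch the flattening is a consequence of the normalization: every chromaticity at its own anchor maps to the same top value, regardless of how different the anchor luminances are.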

The boundary of the MacAdam optimal colors can be found in two ways, that is, as the maximum luminance for a given chromaticity or as the maximum saturation for a given combination of luminance and dominant wavelength. The former was primarily used in both Experiments 1 (the brilliance dimension) and 2 (finding the optimal luminance along stimulus adjustments), and those maximum/optimal luminances have equal brightness as reference anchors. However, there is no such connection for saturation. The maximum saturation for each (monochromatic) hue is not equal, according to both previous work and our results in Fig. 5. This is similar to the Munsell system not achieving a sphere shape and probably suggests the primacy of brightness over saturation, especially from the ecological perspective of surface versus illuminant colors. The asymmetry might have implications for HDR-WCG displays that expand in both luminance and chromatic dimensions. In addition, the physical constraint in the optimal colors translates to the trade-off between luminance and purity in reflective colors, which is not the same as the trade-off between brightness and saturation. The stimuli in Experiment 2 varied their saturation at a constant brightness, which is not a natural gradient of pigment mixing [46]. Although the optimal colors are less physically realistic for reflective materials than zero-blackness NCS samples, their differences were not significant, and adding smoothness constraints on the optimal colors is also doable [19,47]. The extrapolation to the extreme cases of monochromatic colors should be further tested since their frequency of exposure in daily experiences is certainly lower.

4. Conclusion

To incorporate the H-K effect and address the discrepancy between brightness and luminance, in this work, the concepts of brilliance and zero grayness were revisited by two psychophysical experiments with stimuli rendered on an HDR and WCG display. Specifically, in Experiment 1, the $G_0$ luminances for a group of representative chromaticities were collected, which were verified as equally bright and were found to highly correlate with two different physical gamut boundaries including the MacAdam optimal colors. In Experiment 2, the equally bright MacAdam optimal colors were used as stimuli for saturation scaling, and the relations between saturation and physical purity and the H-K effect were analyzed. The results reported in this paper have also been used in the investigation of the independent relation between brightness and saturation [10], which is promising for a better color representation. As brightness is widely useful for various applications such as color management and image processing, lighting, and display optimization, future work includes the benchmarking evaluations of our results for those potential applications relative to the existing brightness metrics.

Acknowledgments

Andreas Kraushaar and Neville Smith kindly provided the physical and digital DIN samples, respectively, that were used in the saturation scaling experiment. We are grateful to all the observers for their input. Part of this work was presented at the conferences of CIC 2021 [33] and VSS 2022 [48].

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper can be found at [49].

References

1. P. Lennie, J. Pokorny, and V. C. Smith, “Luminance,” J. Opt. Soc. Am. A 10(6), 1283 (1993).

2. J. Koenderink, A. van Doorn, and K. Gegenfurtner, “Color weight photometry,” Vision Res. 151, 88–98 (2018).

3. R. M. Boynton, “Ten years of research with the minimally distinct border,” in Visual Psychophysics and Physiology: A Volume Dedicated to Lorrin Riggs, p. 193 (1978).

4. S. Shioiri and P. Cavanagh, “Achromatic form perception is based on luminance, not brightness,” J. Opt. Soc. Am. A 9(10), 1672–1681 (1992).

5. P. K. Kaiser and G. Wyszecki, “Additivity failures in heterochromatic brightness matching,” Color Res. Appl. 3(4), 177–182 (1978).

6. M. Ayama and M. Ikeda, “Brightness-to-luminance ratio of colored light in the entire chromaticity diagram,” Color Res. Appl. 23(5), 274–287 (1998).

7. D. Corney, J.-D. Haynes, G. Rees, and R. B. Lotto, “The brightness of colour,” PLoS One 4(3), e5091 (2009).

8. M. D. Fairchild, Color Appearance Models, Wiley-IS&T Series in Imaging Science and Technology (Wiley, Chichester, West Sussex, 2013) 3rd ed.

9. M. D. Fairchild and R. L. Heckaman, “Deriving appearance scales,” in Color and Imaging Conference, (Society for Imaging Science and Technology, 2012), pp. 281–286.

10. H. Xie and M. D. Fairchild, “Representing color as multiple independent scales: brightness versus saturation,” J. Opt. Soc. Am. A 40(3), 452–461 (2023).

11. Y. Nayatani and H. Sakai, “A relationship between zero-grayness luminance by Evans and perceived brightness of spectrum colors,” Color Res. Appl. 33(1), 19–26 (2007).

12. M. D. Fairchild and E. Pirrotta, “Predicting the lightness of chromatic object colors using CIELAB,” Color Res. Appl. 16(6), 385–393 (1991).

13. S. Hermans, K. A. G. Smet, and P. Hanselaer, “Color appearance model for self-luminous stimuli,” J. Opt. Soc. Am. A 35(12), 2000 (2018).

14. R. M. Evans and B. K. Swenholt, “Chromatic Strength of Colors: Dominant Wavelength and Purity,” J. Opt. Soc. Am. 57(11), 1319 (1967).

15. R. M. Evans, “Fluorescence and Gray Content of Surface Colors,” J. Opt. Soc. Am. 49(11), 1049 (1959).

16. Y. Nayatani, H. Sobagaki, and K. Hashimoto, “Relation between Helmholtz-Kohlrausch Effect, Purity Discrimination, and G0 Function,” J. Light Visual Environ. 17(2), 16–24 (1993).

17. R. L. Heckaman, “Brilliance, contrast, colorfulness, and the perceived volume of device color gamut,” Ph.D. thesis, Rochester Institute of Technology (2008).

18. R. L. Heckaman and M. D. Fairchild, “G0 and the gamut of real objects,” in Proceedings of AIC Color, (2009).

19. J. M. Speigle and D. H. Brainard, “Luminosity thresholds: Effects of test chromaticity and ambient illumination,” J. Opt. Soc. Am. A 13(3), 436 (1996).

20. T. Fujine, T. Kanda, Y. Yoshida, M. Sugino, M. Teragawa, Y. Yamamoto, and N. Ohta, “Relationship between mode-boundary from surface color to fluorescent appearance and preferred gamut on wide-gamut displays,” J. Soc. Inf. Disp. 18(8), 535 (2010).

21. M. Bertalmío, Vision Models for High Dynamic Range and Wide Colour Gamut Imaging: Techniques and Applications, Computer Vision and Pattern Recognition (Academic Press, 2019).

22. K. Uchikawa, H. Uchikawa, and P. Kaiser, “Luminance and saturation of equally bright colors,” Color Res. Appl. 9(1), 5–14 (1984).

23. Apple Inc., “White paper: Pro display XDR technology overview,” Apple (2020) [retrieved Sep. 2020], https://www.apple.com/pro-display-xdr/pdf/Pro_Display_White_Paper_Feb_2020.pdf.

24. Apple Inc., “The Metal framework,” Apple (2020) [retrieved Sep. 2020], https://developer.apple.com/metal/.

25. E. A. Day, L. Taplin, and R. S. Berns, “Colorimetric characterization of a computer-controlled liquid crystal display,” Color Res. Appl. 29(5), 365–373 (2004).

26. M. D. Fairchild, “A victory for equivalent background–on average,” in Color and Imaging Conference, (Society for Imaging Science and Technology, 1999), pp. 87–92.

27. Y. Nayatani and H. Sakai, “Gray and grayness-its complexities in color appearance of surface colors,” Color Res. Appl. 39(1), 37–44 (2014).

28. F. Bonato and A. L. Gilchrist, “Perceived area and the luminosity threshold,” Perception & Psychophysics 61(5), 786–797 (1999).

29. K. Uchikawa, K. Koida, T. Meguro, Y. Yamauchi, and I. Kuriki, “Brightness, not luminance, determines transition from the surface-color to the aperture-color mode for colored lights,” J. Opt. Soc. Am. A 18(4), 737–746 (2001).

30. C.-W. Lin, P. Hanselaer, and K. A. Smet, “Relationship between perceived room brightness and light source appearance mode in different media: reality, virtual reality and 2d images,” in Color and Imaging Conference, (Society for Imaging Science and Technology, 2020), pp. 30–35.

31. A. D. Winkler, L. Spillmann, J. S. Werner, and M. A. Webster, “Asymmetries in blue–yellow color perception and in the color of ‘the dress’,” Curr. Biol. 25(13), R547–R548 (2015).

32. T. Morimoto, A. Numata, K. Fukuda, and K. Uchikawa, “Luminosity thresholds of colored surfaces are determined by their upper-limit luminances empirically internalized in the visual system,” J. Vis. 21(13), 3 (2021).

33. H. Xie and M. D. Fairchild, “G0 revisited as equally bright reference boundary,” in Color and Imaging Conference, (Society for Imaging Science and Technology, 2021), 29, pp. 247–252.

34. D. L. MacAdam, “Maximum visual efficiency of colored materials,” J. Opt. Soc. Am. 25(11), 361–367 (1935).

35. K. Masaoka, “Fast and accurate model for optimal color computation,” Opt. Lett. 35(12), 2031–2033 (2010).

36. M. Richter and K. Witt, “The story of the DIN color system,” Color Res. Appl. 11(2), 138–145 (1986).

37. N. S. Smith and M. D. Fairchild, “Virtual colour atlas,” Color Res. Appl. pp. 1–10 (2022).

38. R. Cao, M. Castle, W. Sawatwarakul, M. Fairchild, R. Kuehni, and R. Shamey, “Scaling perceived saturation,” J. Opt. Soc. Am. A 31(8), 1773–1781 (2014).

39. Y. Nayatani, “Why two kinds of color order systems are necessary?” Color Res. Appl. 30(4), 295–303 (2005).

40. J. C. Stevens and S. S. Stevens, “Brightness function: Effects of Adaptation,” J. Opt. Soc. Am. 53(3), 375–385 (1963).

41. J. M. Kraft, S. I. Maloney, and D. H. Brainard, “Surface-illuminant ambiguity and color constancy: Effects of scene complexity and depth cues,” Perception 31(2), 247–263 (2002).

42. V. Ekroll and F. Faul, “A simple model describes large individual differences in simultaneous colour contrast,” Vision Res. 49(18), 2261–2272 (2009).

43. Y. Nayatani and H. Sobagaki, “Causes of individual differences on brightness/luminance (b/l) ratios,” J. Light Visual Environ. 27(3), 160–164 (2003).

44. A. J. Zele, P. Adhikari, B. Feigl, and D. Cao, “Cone and melanopsin contributions to human brightness estimation,” J. Opt. Soc. Am. A 35(4), B19 (2018).

45. R. F. Murray, “Lightness perception in complex scenes,” Annu. Rev. Vis. Sci. 7(1), 417–436 (2021).

46. R. S. Berns, “Extending CIELAB: Vividness, depth, and clarity,” Color Res. Appl. 39(4), 322–330 (2014).

47. S. A. Burns, “Numerical methods for smoothest reflectance reconstruction,” Color Res. Appl. 45(1), 8–21 (2020).

48. H. Xie and M. D. Fairchild, “Isolating saturation and hue for equally bright colors,” in VSS Annual Meeting, (Vision Sciences Society, 2022).

49. H. Xie and M. D. Fairchild, “Deriving and dissecting an equally bright reference boundary,” OSF (2023), http://doi.org/10.17605/OSF.IO/YF6MW.
