## Abstract

We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and *in vivo* fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.

© 2015 Optical Society of America

## 1. Introduction

Functional near-infrared spectroscopy (fNIRS) is an optical technique able to noninvasively monitor the cerebral hemodynamic at cortical level [1,2]. Exploiting the relatively low absorption of biological tissues, light in the red and near-infrared wavelength range can penetrate the human head down to some centimeters and reach the cerebral cortex. Therefore, fNIRS can provide a measure of oxy- and deoxy-hemoglobin (O_{2}Hb and HHb, respectively), the main chromophores contributing to light absorption at this wavelength range.

Various techniques are proposed in literature to enhance and detect the fNIRS signal, irrespective of the specific hardware implementation (e.g. continuous wave, frequency domain, or time domain regime) of the fNIRS technique. However, because of the relative novelty of the fNIRS technique, a common accepted framework for data analysis has not yet been agreed. The first public domain software package for fNIRS data analysis is Homer (acronym for Hemodynamic Evoked Response) [3]. The software provides a graphical user interface and Matlab scripts for both the preprocessing and the standard statistics on fNIRS data. Homer has been upgraded and the new release Homer2 supports more easily group analyses and re-configuration of the processing stream, and it integrates users algorithms into the processing stream. Another free software is functional Optical Signal Analysis (fOSA) [4], which offers Matlab based functions for a basic analysis of fNIRS data, incorporating several filters for signal denoising and providing also the Statistical Parametric Mapping (SPM) methodology for statistical analysis based on the general linear model (GLM) approach. More focused on the development of SPM routines is the non-commercial software NIRS-SPM [5]. A novelty introduced by this program is represented by a voxel based alignment between interpolated maps instead of an inter-subjects realignment of optodes, in order to facilitate the group analysis. Further it proposes a new theory to deal with the multiple comparison problem for the p-value correction [6]. Another software is NIRS analysis package (NAP) [7], which allows noise removal and GLM analysis, as well as anatomical registration of the measurements. fNIRSOFT is a stand-alone software to process, analyse and visualize fNIRS signals through a graphical user interface and/or scripts distributed by BIOPAC Systems, Inc [8]. Finally POTATo (Platform for Optical Topography Analysis Tools) is a software package for fNIRS signal processing and analysis, developed by Hitachi, Ltd [9]. A comprehensive list of software can be found in the website of the fNIRS Society [10].

The above mentioned tools share some common procedures for preprocessing the fNIRS data and for extracting the features of interest. Given that the functional studies are usually performed by repeating a particular task during several temporal slots, a preliminary way to inspect the fNIRS results is to detrend the signal by subtracting a mean value registered in a rest period before each task repetition and then to visualize the time series of O_{2}Hb and HHb signals as the average over all the trials. The expected hemodynamic response is identified where the O_{2}Hb concentration increases and simultaneously the HHb decreases. Often the functional activity could be not easily detected because of the simultaneous presence of confounding effects like hemodynamic changes in the superficial layer (either systemic or task related), and movement artefacts. Therefore a careful data analysis and statistical inference must be considered to properly detect the signals related to a neuronal activation [11–13]. Several methods to correct fNIRS signals are proposed and they are based on bandpass filtering or on data decomposition by means of principal component analysis (PCA) or independent component analysis (ICA) [14,15]. Another approach in GLM analysis is measuring and explicitly modelling the physiological confounds, inserting them as nuisance regressors [16].

For statistical inference, the GLM approach has been adopted in most of the fNIRS software tools. GLM is a regression model assuming that a functional of a given signal can be modeled as a linear combination of known regressors, usually consisting in task-related boxcar functions. GLM was originally adopted for fMRI data analysis [17], and later used in fNIRS, taking advantage of the similarity between fMRI and fNIRS experiments in terms of design and hypotheses [4,18]. Anyway, because of the substantial differences between fMRI and fNIRS techniques, above all the lower fNIRS spatial resolution due to the sparse optodes distribution over the head, there are a number of limitations, assumptions and specific issues that have to be considered when applying GLM on optical data [19].

In this paper we propose a new statistical method for the fNIRS data analysis to statistically discriminate the hemodynamic activations. The proposed method is based on a linear regression model, and the main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data sets for the application of statistical tests, such as the assumptions of independence, homoscedasticity and Gaussianity of residuals in order to guarantee the Gaussianity of the coefficients estimators. Furthermore, we propose a clustering algorithm aiming at a better localization of the activated vs. not activated channels. The method has been tested on simulated data mimicking real fNIRS measurements, and then applied on *in-vivo* measurements. In particular, we focused on time domain (TD) fNIRS data since this technique allows the best discrimination between cortical activation and superficial physiological changes. However, the proposed method is also suitable for continuous wave or frequency-domain fNIRS data sets.

## 2. Materials and methods

#### 2.1 Synthetic fNIRS data

A synthetic fNIRS data set has been created to mimic real multichannel TD fNIRS measurements on a healthy adult during a motor task (handgrip experiment). A forward procedure and an inverse procedure were the main steps to obtain the synthetic data set.

The forward procedure consists in: 1) defining the head geometry; 2) assigning values for the hemodynamic parameters so as to calculate the hemodynamic response functions during a protocol; 3) converting hemodynamic parameters into absorption coefficients; 4) adding the information on the reduced scattering coefficients; 5) using a photon diffusion model to generate distributions of photons time-of-flight; 6) adding measurement noise to mimic a real TD fNIRS measurement.

The head has been modeled as a bilayered medium, where the upper layer is 1 cm thick, and the lower layer is ideally a semi-infinite medium. To a first approximation in fact this geometry can be used to simulate fNIRS measurements on the adult head, where an extra-cerebral layer (composed by scalp, skull and cerebrospinal fluid) overlays the intra-cerebral one (gray and white matter). The thickness of the upper layer presents a wide range of values depending on subjects and on the measurement position (skin-cortex distance spanning from 7.25 to 20 mm) [20]. However a recent study proposed by Strangman et al. [21] provides tabulated values of the scalp and skull thicknesses for all the positions of the 10/20 EEG standard system, based on the segmentation of the MRI-based head model “Colin27” [22]. In the areas related to our experiment the reported thickness values together with a mean thickness value of the cerebrospinal fluid [20] confirm the choice of 1cm as average distance between scalp and brain cortex.

Hemoglobin concentrations were simulated by considering reference values of 12 μM for the O_{2}Hb and 7 μM for the HHb in the superficial layer, and reference values of 30 μM for the O_{2}Hb and 20 μM for the HHb in the lower layer [23]. A data set for 30 independent channels was generated, considering an optodes distribution over both central hemispheres (see Fig. 1).

Two different experiments were then simulated.

The first experiment (EXP1) considers an ideal situation where a neuronal activation is generated in the cortical layer by a motor task, and no physiological oscillations occur in the superficial layer. This experiment consists of 10 repetitions (also called trials) of 10 s of baseline, 20 s of right handgrip movement and 10 s of recovery, for an overall experiment duration of 400 s. The O_{2}Hb and HHb concentrations in the upper layer were simulated constant at the reference values during the whole experiment, while the concentrations in the lower layer were perturbed in some channels so as to mimic a hemodynamic response in correspondence to the task periods. This superimposed response profile was calculated as a convolution of a boxcar function, representing the task and rest alternation, with the Hemodynamic Response Function (HRF) evoked by a single stimulus. By following the method proposed by Scarpa et al. [24], the HRF was modeled as a linear combination of two different gamma-variant time-dependent functions *Γ _{n}*:

_{1}and τ

_{2}regulate the HRF shape, ρ

_{1}and ρ

_{2}tune its scale, while β determinates the undershoot.

*p*coefficient was set to 5 as suggested by Glover et al. [25]. Variability in amplitude of about 5% was considered among the different repetitions to account for possible differences in the execution of the task and/or in the functional response (e.g. habituation effects). The HRF peak for the O

_{2}Hb was chosen to be around 1555 ± 75 nM; the HRF for the HHb was inverted and with a maximum set at −1/3 with respect to the O

_{2}Hb response. The free parameters have been chosen so as to create a HRF similar to the one expected for the motor task of interest (α = 1282, β = 0.17, τ

_{1}= 1, τ

_{2}= 1, ρ

_{1}= −0.5, ρ

_{2}= 3.5). To simulate an actual neuronal activation localized around the central positions of the hemisphere contralateral to the movement, some channels were considered activated with different intensities: number 16 (25% HRF), 17 (100% HRF), 18 (50% HRF), 21 (50% HRF), 28 (25% HRF), 29 (50% HRF); in the other channels no hemodynamic response was added.

In a second experiment (EXP2) the reference values for both layers and for the hemodynamic changes happening in the lower layer were identical to EXP1. However a physiological noise was added in the superficial layer by following the procedure reported by Scarpa et al. [24]. An oscillation was built for O_{2}Hb signal for one channel as a superposition of sinusoidal functions at different mean frequencies and amplitudes, as reported in Table 1:

*f*vary in the channel for every repetition in the range described in Table 1, while phases ${\varphi}_{i}$ are equally distributed between 0 and 2π and are different for each trial.

_{i}The generated signal was then replicated for all the channels by modifying the oscillation amplitude of a random value between ± 10%. The HHb variations have been generated by threefold reducing the magnitude of the physiological noise simulated for the O_{2}Hb.

The absorption coefficients at two wavelengths (690 and 820 nm) for both layers were computed from these hemoglobin concentration changes by exploiting the Lambert Beer law and the a priori knowledge of the specific O_{2}Hb and HHb absorption [26].

The scattering coefficients at the same wavelengths were derived from a simple approximation of the Mie theory [27]:

by fixing the scattering amplitude*a*and the power

*b*respectively at 12 cm

^{−1}and 0.5 for the upper layer, and at 12 cm

^{−1}and 1 for the lower layer, for a reference λ

_{0}at 660 nm [28]. A forward model for photon diffusion in a bilayered geometry [29] was used to generate synthetic time-resolved reflectance (TRR) curves for each channel by using as input parameters the optical properties and the source detector distance (fixed at 3 cm). A count rate of 5·10

^{5}ph/s was considered, the integration time was set at 1 s, and Poisson noise was added to the simulated curves to mimic real measurements.

The inverse procedure involved the following steps: 1) estimating the baseline optical properties and the absorption changes in the upper and lower layer; 2) calculating the hemodynamic parameters from the absorption coefficients.

The absolute values of *μ _{a}* and

*μ*’ have been recovered by TD fNIRS data by fitting the curves of the baseline period preceding each task with a physical model for reflectance geometry in a homogeneous medium [29]. Then absorption changes have been computed by means of the method proposed by Zucchelli

_{s}*et al.*[30]. The method allows the discrimination of superficial and deep absorption variations from TRR curves. It takes into account the effect of system set-up, as described by the Instrument Response Function (IRF) and the heterogeneous structure of the human head for the refined computation of the photon time-dependent pathlengths within each layer the tissue is composed of. It makes use of an approach based on time-gating of the photon time-of-flights distribution. The fNIRS signal coming from the deep regions, more likely involved in the cerebral activity, can be corrected from the superficial variations of the absorption properties, mainly due to systemic hemodynamics changes. Figures 2 and 3 show the time courses (folding average) for changes in O

_{2}Hb and HHb for the EXP2 in the upper layer and in the lower layer, respectively.

#### 2.2 In vivo fNIRS data

The proposed method has been applied on *in vivo* data, acquired by a multi-channel dual-wavelength TD fNIRS medical device developed at Politecnico di Milano, Department of Physics [31], to preliminary validate its performances in real life settings.

One right-handed healthy subject (male, 44 years old) underwent the experiment, consisting in a motor task (i.e. squeezing a soft ball in the right hand) at a rate of 2 Hz guided by a metronome. The same protocol simulated for synthetic data was maintained for *in vivo* experiment (10 repetitions of 10 s baseline, 20 s task and 10 s recovery, total duration 400 s). Instructions about the movement and rest were given by presenting a picture on a screen, which always had a fixation cross in the center. A total of fifteen detection bundles and eight light sources were positioned over the sensorimotor areas centered on C3 and C4 positions of the 10/20 standard system, following the configuration represented in Fig. 1. Pairs of light sources were sequentially illuminated in the left and right hemisphere every 0.25 s allowing the acquisition of 30 measurement points (channels) with an overall acquisition time of 1 s. Hemodynamic parameters were estimated by following the same steps previously described as “inverse procedure”.

The experiment was part of another study [32] that was reviewed and approved by the local ethics committee and it was conducted in compliance with the Declaration of Helsinki; the subject signed a written informed consent to their participation in the study.

Folding average results for changes in O_{2}Hb and HHb in the superficial and lower layer are shown in Figs. 4 and 5 respectively.

#### 2.3 Statistical analysis

The proposed regression model consists in the following steps: 1) pre-processing; 2) estimate of the regressors coefficients for each trial; 3) inference test on the coefficients; 4) K-means algorithm for cluster analysis of activated channels. The method has been implemented in R Core Team (2014) [33], on a standard computer (Intel Core i5, 6GB RAM).

A pre-processing algorithm is initially applied to the fNIRS data sets. The sample mean, calculated on the first 10 s of each trial, is subtracted to the related trial, in order to detrend data. Then a smoothing spline algorithm is applied to the whole signal. If ${y}_{t}$ is hemoglobin concentration at time $t=1:400$s, the algorithm calculates the curve $\widehat{y}\left(t\right)$ that minimize (in the class of twice differentiable functions) the following quantity:

The first term represents the estimated Mean Squared Error (MSE) when using $\widehat{y}\left(t\right)$ to estimate${y}_{t}$. The second term penalizes the curvature of$\widehat{y}\left(t\right)$. The parameter $\xi $ controls the trade-off between the accuracy of $\widehat{y}\left(t\right)$ (for $\xi $ = 0 it corresponds to the original data) and how it is smoothed. Thus data are estimated through the smoothing spline $\widehat{y}\left(t\right)$ that minimizes a weighted sum of MSE and the average curvature.

Instead of using all the 400 s for a single linear regression model, we divide the time series of each channel in 10 sub-intervals (i.e. repetitions or trials) lasting 40 s (made of 10 s rest, 20 s task, and 10 s rest). We then individually apply a linear regression model to each repetition.

Each trial is the elementary sequence revealing activation. It represents a realization of the same phenomenon. It can therefore be interesting to first analyse each trial independently, writing a regression model for each of them. Then we look for an appropriate quantity that summarizes the information about activation in each sub-interval. We consequently have a sample of size 10 of these quantities for each channel, on which we can make inference. This approach aims at limiting the most critical problems that are found in analysing fNIRS data through a linear (or a generalized linear) regression model: correlation, heteroscedasticity and not Gaussianity of residuals, that question the Gaussianity of the coefficients estimates.

If *i* indicates the sub-interval and *k* indicates the channel, we build for each channel $k=1:30$ the 10 following linear regression models:

Under the hypothesis that O_{2}Hb increases during the task period, the regressors for O_{2}Hb, *rest* and *task*, are obtained through a convolution between the HRF and a step-function equal to 0 in the first and last 10 s and 1 elsewhere for *task* (the opposite for *rest*) (see Fig. 6 left). On the contrary, given that HHb is expected to decrease during the task, the regressor for HHb is built as a convolution between the HRF and a step-function equal to 0 in the first and last 10 s, −1 elsewhere for *task*, the opposite for *rest* (see Fig. 6 right).

For each channel *k* and sub-interval *i*, we then calculate the Ordinary Least Squares estimators for${\beta}^{k}$, as:

Following this procedure we obtain fitted values ${\widehat{y}}^{i,k}.$ more similar to ${y}^{i,k}.$ than the ones found through a single linear regression model. An example is reported in Fig. 7, where fitted values and real data are plotted for regression models on trials (Fig. 7 left) and for a unique regression model (Fig. 7 right). In the first situation the Squared Error (SE) is lower than in the second one (34.1 and 42.4 μM^{2} respectively), in which the obtained coefficients are the same for all trials, preventing any variability. For this reason the approach with regression models on trials well suits also experiments where the intensity of the HRF varies in time.

Moreover, through this approach, we have a population of ${\beta}_{rest}^{k}$ and ${\beta}_{task}^{k}$ for each channel *k* on which we can make inference test. The Gaussianity of ${\widehat{\beta}}_{rest}^{i,k}$ and ${\widehat{\beta}}_{task}^{i,k}$ is first investigated, as well as the Gaussianity of the linear combination${\widehat{\beta}}_{task}^{i,k}-{\widehat{\beta}}_{rest}^{i,k}$, through running, on each channel *k*, three Shapiro-Wilk tests (one on the ten-observations sample$\left({\widehat{\beta}}_{rest}^{1,k},{\widehat{\beta}}_{rest}^{2,k},\dots ,{\widehat{\beta}}_{rest}^{10,k}\right)$, one on$\left({\widehat{\beta}}_{task}^{1,k},{\widehat{\beta}}_{task}^{2,k},\dots ,{\widehat{\beta}}_{task}^{10,k}\right)$, one on their linear combination). P-values are almost always higher than 0.05 in every data set, both simulated and *in vivo*, with O_{2}Hb and HHb measures. An example is reported in Fig. 8. We can therefore assume that, for each fixed channel *k*, ${\widehat{\beta}}_{rest}^{i,k}$ belongs to a normal distribution, as well as ${\widehat{\beta}}_{task}^{i,k}$ and ${\widehat{\beta}}_{task}^{i,k}-{\widehat{\beta}}_{rest}^{i,k}$. In order to discriminate between activated/not-activated channels we focus on the contrast of the coefficients, ${\widehat{\beta}}_{task}^{i,k}-{\widehat{\beta}}_{rest}^{i,k}$, coherently with the current literature [17].

We use the 10 found linear combinations ${\widehat{\beta}}_{task}^{i,k}-{\widehat{\beta}}_{rest}^{i,k}$ and their Gaussianity to implement an inference test. We then build a map that shows the degree of activation of each channel. We conduct a hypothesis test for each channel *k* on the expected value of ${\widehat{\beta}}_{task}^{i,k}-{\widehat{\beta}}_{rest}^{i,k}$ and we use the P-values to draw the map.

In particular, for fixed channel *k*, the test will be:

Theoretically the decision of the test would be the acceptance of the null hypothesis for the not-activated channels, and the rejection for the activated ones. In fact if a channel is activated we expect that the linear combination of the regressors is significant, and the coefficients related to it have expected value higher than 0. Due to the Gaussianity of ${\widehat{\beta}}_{task}^{i,k}-{\widehat{\beta}}_{rest}^{i,k}$ we can calculate for every test the following test statistic:

Under the null hypothesis it is distributed as a t-student with $n-1$ degrees of freedom. Here the sample size, $n$, is equal to 10, ${\overline{X}}_{n}^{k}$ is the sample mean of${\widehat{\beta}}_{task}^{i,k}-{\widehat{\beta}}_{rest}^{i,k}$, and ${S}_{k}^{2}$ is the sample variance. A P-value for each channel can be calculated as$P(X>{T}_{k})$, where $X$ is a random variable from a t-student distribution with $n-1$ degrees of freedom. We finally plot a map (named activation map, see Figs. 9 and following) in which the color of each channel is proportional to its P-value. Colors vary from white (activation) to black (no activation).

#### 2.4 K-means clustering algorithm

The localization of an activated area through a statistical analysis can be confirmed through K-means clustering algorithm [34]. If *k* indicates the channel and *i* the trial, the following vector is considered for each channel:

A K-means algorithm is applied to the 30 vectors in${\mathbb{R}}^{10}$. This clustering algorithm separates the 30 vectors in $m$ groups, finding clusters that minimize the Euclidean distance within clusters and maximizes the one between clusters. We set $m$ equal to 2, because we expect to observe two clusters: one with activated channels, one with not-activated ones.

The algorithm consists of 3 steps:

- 1. Initialization, in which the initial centers are randomly fixed.
- 2. The Euclidean distances from $m$centers are calculated for each vector, then vectors are assigned to the cluster with the nearest center.
- 3. Updating of centers: the center of each cluster is calculated as the mean between the vectors belonging to the cluster.

Steps 2) and 3) are repeated until convergence.

The choice of $m=2$ is confirmed also by the *average silhouette width*, a quality index allowing to select the number of clusters [35]. For each fixed$m$, the index varies from –1 to 1, increasing if the algorithm well classifies the vectors, decreasing if they are badly classified. An index is calculated for K-means with different number of clusters. Then K-means with the highest average silhouette width is chosen. If $m=2$the index is equal to 0.85 for O_{2}Hb, while for $2m7$ we obtain indices lower than 0.40. Similar results are obtained for EXP2. This confirms our choice.

Finally we observe that the K-means algorithm is robust with respect to the initialization value, as reported by Hartigan et al. [34]. Specifically, in step 1) we initialize our analysis from channel 14 and 21, being in the center of the measured areas, without loss of robustness.

## 3. Results

#### 3.1 Synthetic fNIRS data

Activation maps for O_{2}Hb and HHb are reported in Fig. 9 for EXP1 and in Fig. 10 for EXP2.

For data set EXP1 it is clear how the proposed method can discriminate the activated channels in the deeper region. P-values related to activated channels are in fact close to 0 (equal to 0 rounding to the third decimal place), creating a sharp division in the map between white and colored channels. At first sight, channel 30 might seem activated. Its P-value (P = 0.004) is however bigger (by more than a factor 10) than the highest activated-channel P-value (P = 0.0003 for channel 28).

If we consider as activated a channel with P-value lower than 0.001, we note that the map can discriminate exactly the activated channels. More specifically, the lowest P-value (less than 10^{−7}) is the one referred to channel 17, the most activated one (100% HRF), while channels with the lowest activation intensity (ch. 16 and 28, 25% HRF) have the highest P-values among the activated ones (less than 0.0005).

Also in the activation map for the deeper layer of data set EXP2 activated channels are instantly detectable. In the superficial layer P-values are very high (higher than 0.5) and very uniform, confirming that the simulated O_{2}Hb is noise.

We can notice in the HHb activation maps the same trends found for O_{2}Hb. Revealing an activation in this situation is more difficult, because activation amplitude is lower than in O_{2}Hb measures. Nevertheless, this method performs a good channels classification on these measures too: the only channel with a slightly high P-value is 28 (25% HRF).

#### 3.2 In vivo data

The method proposed for activated channels detection was tested also on *in vivo* data. *In vivo* data set was treated in the same way as the synthetic ones. Coefficients estimates were then tested in order to evaluate the dependence between data and regressors. Activation maps were calculated in the same way. This procedure produced good results on *in vivo* data, and it proved to be suitable in revealing activated channels. Activation maps for *in vivo* subject are reported in Fig. 11. In the superficial layer the subject shows high P-values in most of the channels, while active areas can be clearly identified in the left hemisphere of deep layer. This proves the efficacy of the proposed procedure for activation detection, that can be successfully applied also on *in vivo* data.

#### 3.3 K-means

K-means activation maps for synthetic data are reported in Fig. 12 for EXP1 and EXP2. In these images channels from different clusters are represented with different colors. Channels simulated as active (in deep layer) are circled by green.

The K-means algorithm is able to identify most of the activated channels for both O_{2}Hb and HHb in the deep layer. The channels that present higher activation intensity (channels 17, 18, 21, 29) are precisely clustered, for both EXP1 and EXP2. Conversely, channels 16 and 28, that present a low intensity of activation (25% HRF), are assigned to the not-activated cluster. If there is no activation, as happens in the upper layer, the algorithm is not suitable. In fact it is forced to separate channels in two groups, even if vectors shouldn’t be divided in two clusters. Thus results are unpredictable.

Figure 13 shows K-means maps for the *in vivo* subject. The algorithm performs a classification coherent with the activation maps, always assigning the activated channels to the same cluster. We recall that a channel is considered as registering activation in the low layer only if there is a simultaneous increase of oxyhemoglobin and decrease of deoxyhemoglobin. In some situations, if there are one or more highly noisy channels, the algorithm separates them from the others. Thus highly noisy channels must be excluded from the K-means analysis, to avoid clusters composed only by one or two anomalous channels. Excluded channels are reported in pink (their irregularity can be verified looking at folded averages in Figs. 4 and 5).

## 4. Discussion

In this paper a new method for analysis of fNIRS data has been introduced. It is based on linear regression models using, as regressors, convolutions between scale functions and HRF, as in current literature. The main novelty is the data splitting in trial or sub-intervals, each one representing a realization of the elementary activation sequence, and the application of a linear regression model on each of them. In this way, an investigable sample of coefficients is obtained for each channel. In agreement with the literature we specifically focused on a linear combination of the coefficients. We verified the Gaussianity of the linear combinations through Shapiro-Wilks tests. Thus, rather than a unique linear combination for each channel, that is difficult to analyse because of heteroscedasticity, not-normality and correlation of residuals, we have a normal population of linear combinations for each channel, which can be easily investigated through an inference test. For each channel a one-tailed hypothesis test on the expected values of the linear combinations is built, expecting a rejection (acceptance) of the null hypothesis for activated (not-activated) channels. The related P-values are used to draw the activation maps. A further novelty is the use of a clustering algorithm (K-means) as a useful additional instrument in activation detection. K-means algorithm separates channels in two sharp groups: the information on the activation degree of each channel is then lost. The output of the algorithm is a binary assignment for each channel, that is simply labelled as “active/not-active”. Thus it can happen that low activated channels are assigned to the cluster of not-activated ones. This happens for example for channels 16 and 28 in Fig. 12, that present a low intensity of activation (25% HRF). The channels that present a higher activation intensity (channels 17, 18, 21, 29) are precisely clustered, for both EXP1 and EXP2, O_{2}Hb and HHb measures. The clustering algorithm can be an important instrument of control: its right clustering is a further confirmation of the accuracy of the previous analysis and of the calculated activation maps. In fact it follows a different procedure compared to the linear regression model. It aims to detect the same channels applying another kind of analysis, that doesn’t use statistical tests and is based on different hypothesis. The clustering algorithm works well if there are channels with “similar” vectors (thus with a similar evolution in time. This is expected to happen with activated channels). If there is no activation the algorithm is forced to separate channels in two groups, thus results are unpredictable and with no sense. Some problems can also arise if one (or more) channels are particularly noisy and product very heterogeneous vectors: in this situation the K-means algorithm can be inaccurate, separating noisy channels from the others. For this reason highly noisy channels should be excluded before an analysis with K-means clustering algorithm.

A limitation of this study could be the accuracy of the simulated data set. We should in fact observe that the creation of synthetic data well reproducing the superficial and deep O_{2}Hb and HHb concentration changes happening during a succession of rest and task periods is not a trivial issue. As a matter of fact the actual magnitude and frequency components of O_{2}Hb and HHb physiological oscillations occurring concurrently within the s and in the cerebral cortex, during rest or task periods, is never fully reported in literature with adequate precision. Often they are presented only for a range of frequencies, for partial regions of the head, or for physiological parameters different than hemoglobin species (blood flow, pressure or volume components) [36–38]. Moreover, the quantitative definition of the amplitude variation within the brain of both species of hemoglobin following a neuronal activation, i.e. the hemodynamic response, is still an open issue in the scientific community. In fact reported fNIRS data present inaccuracies related to different factors: for instance most of the retrieved concentration values are obtained with CW fNIRS instruments, and thus present an intrinsic measurement error due to the poor depth resolution of the technique: the obtained cerebral signal is inevitably affected by extracerebral concentration variations [39–41]. Further, the *in vivo* optical (absorption and scattering) properties of biological tissues are hardly measurable and data in literature present a high variability in the results [42]. Finally, the different anatomical characteristics within and between subjects produce unavoidable analysis errors. Despite all these limitations the simulated data sets presented in this paper can be effectively used to test the performances of data analysis procedures. Finally, it is worthwhile to observe that the presented work does not aim at solving the specific issue of superficial systemic contamination in fNIRS signal. Rather it is focused on the description and validation of a statistically correct method to identify activated channels. For the method’s application, the input data are supposed to be previously corrected from the superficial contribution: specifically in our case we have taken advantage of TD fNIRS and of simple detrending as pre-processing tool. A future development could include the choice of other filters or algorithms for preprocessing analysis, and a more flexible arrangement of the regressors in the GLM. A family wise correction method for the p-value thresholding could be studied and implemented to account for the spatial coherence of fNIRS data.

## 5. Conclusion

The present study proposes a new procedure for the statistical analysis of the activated channels in fNIRS data. The introduced method minimizes the hypothesis made on data for the application of statistical tests. Thus it can be employed on a wide class of data sets, without losing validity even if high correlation and heteroscedasticity of residuals are proved or their Gaussianity is not verified. All these assumptions are relaxed through a model that provides a sample of activation-related quantities for each channel. The unique required hypothesis is Gaussianity of the activation-related quantities, and this hypothesis was always confirmed by statistical tests. This procedure was validated on a *synthetic* data set and then on *in vivo* data from TD fNIRS, and it produced good results in both situations, detecting activated channels with precision. A clustering algorithm (K-means) is also proposed in the present study as a useful additional tool for activated channels detection. This clustering algorithm doesn’t require any statistical hypothesis. It divides channels in two groups trough geometric considerations on activation-related quantities of each channel. Because of the different proceedings compared to the previous algorithm, K-means can be used as a reinforcing control instrument after the proposed method execution. To our knowledge the proposed method can complement the procedures contained in the most used software for fNIRS data analysis (e.g Homer2 and NIRS-SPM). However, a thorough comparison of the outcomes of different software tools is out of the scope of this paper.

## References and links

**1. **M. Ferrari and V. Quaresima, “A brief review on the history of human functional near-infrared spectroscopy (fNIRS) development and fields of application,” Neuroimage **63**(2), 921–935 (2012). [CrossRef] [PubMed]

**2. **F. Scholkmann, S. Kleiser, A. J. Metz, R. Zimmermann, J. Mata Pavia, U. Wolf, and M. Wolf, “A review on continuous wave functional near-infrared spectroscopy and imaging instrumentation and methodology,” Neuroimage **85**(Pt 1), 6–27 (2014). [CrossRef] [PubMed]

**3. **T. J. Huppert, S. G. Diamond, M. A. Franceschini, and D. A. Boas, “HomER: a review of time-series analysis methods for near-infrared spectroscopy of the brain,” Appl. Opt. **48**(10), D280–D298 (2009). [CrossRef] [PubMed]

**4. **P. H. Koh, D. E. Glaser, G. Flandin, B. Butterworth, A. Maki, D. Delpy, and C. E. Elwell, “Functional Optical Signal Analysis (fOSA): A Software Tool for NIRS Data Processing Incorporating Statistical Parametric Mapping (SPM),” J. Biomed. Opt. **12**(6), 064010 (2007). [CrossRef] [PubMed]

**5. **J. C. Ye, S. Tak, K. E. Jang, J. Jung, and J. Jang, “NIRS-SPM: statistical parametric mapping for near-infrared spectroscopy,” Neuroimage **44**(2), 428–447 (2009). [CrossRef] [PubMed]

**6. **H. Li, S. Tak, and J. C. Ye, “Lipschitz-Killing curvature based expected Euler characteristics for p-value correction in fNIRS,” J. Neurosci. Methods **204**(1), 61–67 (2012). [CrossRef] [PubMed]

**7. **T. Fekete, D. Rubin, J. M. Carlson, and L. R. Mujica-Parodi, “The NIRS Analysis Package: noise reduction and statistical inference,” PLoS ONE **6**(9), e24322 (2011). [CrossRef] [PubMed]

**8. **http://www.biopac.com/fNIR-Software-Professional-Edition.

**9. **http://www.hitachi.co.jp/products/ot/analyze/kaiseki_en.html.

**10. **http://fnirs.org/software.

**11. **S. Tak and J. C. Ye, “Statistical analysis of fNIRS data: A comprehensive review,” Neuroimage **85**(Pt 1), 72–91 (2014). [CrossRef] [PubMed]

**12. **I. Tachtsidis, T. S. Leung, A. Chopra, P. H. Koh, C. B. Reid, and C. E. Elwell, “False positives in functional near-infrared topography,” Adv. Exp. Med. Biol. **645**, 307–314 (2009). [CrossRef] [PubMed]

**13. **E. Kirilina, A. Jelzow, A. Heine, M. Niessing, H. Wabnitz, R. Brühl, B. Ittermann, A. M. Jacobs, and I. Tachtsidis, “The physiological origin of task-evoked systemic artefacts in functional near infrared spectroscopy,” Neuroimage **61**(1), 70–81 (2012). [CrossRef] [PubMed]

**14. **Y. Zhang, D. H. Brooks, M. A. Franceschini, and D. A. Boas, “Eigenvector-based spatial filtering for reduction of physiological interference in diffuse optical imaging,” J. Biomed. Opt. **10**(1), 011014 (2005). [CrossRef] [PubMed]

**15. **J. Virtanen, T. Noponen, and P. Meriläinen, “Comparison of principal and independent component analysis in removing extracerebral interference from near-infrared spectroscopy signals,” J. Biomed. Opt. **14**(5), 054032 (2009). [CrossRef] [PubMed]

**16. **I. Tachtsidis, P. H. Koh, C. Stubbs, and C. E. Elwell, “Functional optical topography analysis using statistical parametric mapping (SPM) methodology with and without physiological confounds,” Adv. Exp. Med. Biol. **662**, 237–243 (2010). [CrossRef] [PubMed]

**17. **K. Friston, J. Ashburner, S. Kiebel, T. Nichols, and W. Penny, *“**Statistical Parametric Mapping: The Analysis of Functional Brain Images,”* (Elsevier, 2007).

**18. **M. L. Schroeter, M. M. Bücheler, K. Müller, K. Uludağ, H. Obrig, G. Lohmann, M. Tittgemeyer, A. Villringer, and D. Y. von Cramon, “Towards a standard analysis for functional near-infrared imaging,” Neuroimage **21**(1), 283–290 (2004). [CrossRef] [PubMed]

**19. **J. C. Ye, S. Tak, K. E. Jang, J. Jung, and J. Jang, “NIRS-SPM: statistical parametric mapping for near-infrared spectroscopy,” Neuroimage **44**(2), 428–447 (2009). [CrossRef] [PubMed]

**20. **E. Okada and D. T. Delpy, “Near-infrared light propagation in an adult head model. ii. Effect of superficial tissue thickness on the sensitivity of the near-infrared spectroscopy signal,” Appl. Opt. **42**(16), 2915–2922 (2003). [CrossRef] [PubMed]

**21. **G. E. Strangman, Q. Zhang, and Z. Li, “Scalp and skull influence on near infrared photon propagation in the Colin27 brain template,” Neuroimage **85**(Pt 1), 136–149 (2014). [CrossRef] [PubMed]

**22. **C. J. Holmes, R. Hoge, L. Collins, R. Woods, A. W. Toga, and A. C. Evans, “Enhancement of MR images using registration for signal averaging,” J. Comput. Assist. Tomogr. **22**(2), 324–333 (1998). [CrossRef] [PubMed]

**23. **L. Gagnon, C. Gauthier, R. D. Hoge, F. Lesage, J. Selb, and D. A. Boas, “Double-layer estimation of intra- and extracerebral hemoglobin concentration with a time-resolved system,” J. Biomed. Opt. **13**(5), 054019 (2008). [CrossRef] [PubMed]

**24. **F. Scarpa, S. Brigadoi, S. Cutini, P. Scatturin, M. Zorzi, R. Dell’acqua, and G. Sparacino, “A reference-channel based methodology to improve estimation of event-related hemodynamic response from fNIRS measurements,” Neuroimage **72**, 106–119 (2013). [CrossRef] [PubMed]

**25. **G. H. Glover, “Deconvolution of impulse response in event-related BOLD fMRI,” Neuroimage **9**(4), 416–429 (1999). [CrossRef] [PubMed]

**26. **S. Prahl, “Optical absorption of hemoglobin,” (2013) http://omlc.ogi.edu/spectra/hemoglobin/index.html.

**27. **J. R. Mourant, T. Fuselier, J. Boyer, T. M. Johnson, and I. J. Bigio, “Predictions and measurements of scattering and absorption over broad wavelength ranges in tissue phantoms,” Appl. Opt. **36**(4), 949–957 (1997). [CrossRef] [PubMed]

**28. **A. Torricelli, A. Pifferi, P. Taroni, E. Giambattistelli, and R. Cubeddu, “In vivo optical characterization of human tissues from 610 to 1010 nm by time-resolved reflectance spectroscopy,” Phys. Med. Biol. **46**(8), 2227–2237 (2001). [CrossRef] [PubMed]

**29. **F. Martelli, S. Del Bianco, A. Ismaelli, and G. Zaccanti, *Light Propagation through Biological Tissue and Other Diffusive Media: Theory, Solutions and Software* (SPIE Press, 2010).

**30. **L. Zucchelli, D. Contini, R. Re, A. Torricelli, and L. Spinelli, “Method for the discrimination of superficial and deep absorption variations by time domain fNIRS,” Biomed. Opt. Express **4**(12), 2893–2910 (2013). [CrossRef] [PubMed]

**31. **D. Contini, L. Spinelli, M. Caffini, R. Cubeddu, and A. Torricelli, “A multichannel time-domain brain oximeter for clinical studies,” Proceedings of SPIE-OSA Biomedical Optics7369, 1D (2009).

**32. **E. Visani, L. Canafoglia, I. Gilioli, D. R. Sebastiano, V. E. Contarino, D. Duran, F. Panzica, R. Cubeddu, D. Contini, L. Zucchelli, L. Spinelli, M. Caffini, E. Molteni, A. M. Bianchi, S. Cerutti, S. Franceschetti, and A. Torricelli, “Hemodynamic and EEG Time-Courses During Unilateral Hand Movement in Patients with Cortical Myoclonus. An EEG-fMRI and EEG-TD-fNIRS Study,” Brain Topogr.in press. [PubMed]

**33. **R: A language and environment for statistical computing. R Foundation for Statistical Computing,” Vienna, Austria. http://www.R-project.org/.

**34. **J. A. Hartigan and M. A. Wong, “A K-means clustering algorithm,” Appl. Stat. **28**(1), 100–108 (1979). [CrossRef]

**35. **A. Struyf, M. Hubert, and P. Rousseeuw, “Clustering in an object-oriented environment,” J. Stat. Softw. **1**(4), 10630 (1997).

**36. **T. B. Kuo, C. M. Chern, W. Y. Sheng, W. J. Wong, and H. H. Hu, “Frequency Domain Analysis of Cerebral Blood Flow Velocity and Its Correlation With Arterial Blood Pressure,” J. Cereb. Blood Flow Metab. **18**(3), 311–318 (1998). [CrossRef] [PubMed]

**37. **H. Obrig, M. Neufang, R. Wenzel, M. Kohl, J. Steinbrink, K. Einhäupl, and A. Villringer, “Spontaneous low frequency oscillations of cerebral hemodynamics and metabolism in human adults,” Neuroimage **12**(6), 623–639 (2000). [CrossRef] [PubMed]

**38. **Y. Tong and B. D. Frederick, “Time lag dependent multimodal processing of concurrent fMRI and near-infrared spectroscopy (NIRS) data suggests a global circulatory origin for low-frequency oscillation signals in human brain,” Neuroimage **53**(2), 553–564 (2010). [CrossRef] [PubMed]

**39. **D. Canova, S. Roatta, D. Bosone, and G. Micieli, “Inconsistent detection of changes in cerebral blood volume by near infrared spectroscopy in standard clinical tests,” J. Appl. Physiol. **110**(6), 1646–1655 (2011). [CrossRef] [PubMed]

**40. **L. Gagnon, M. A. Yücel, M. Dehaes, R. J. Cooper, K. L. Perdue, J. Selb, T. J. Huppert, R. D. Hoge, and D. A. Boas, “Quantification of the cortical contribution to the NIRS signal over the motor cortex using concurrent NIRS-fMRI measurements,” Neuroimage **59**(4), 3933–3940 (2012). [CrossRef] [PubMed]

**41. **T. Takahashi, Y. Takikawa, R. Kawagoe, S. Shibuya, T. Iwano, and S. Kitazawa, “Influence of skin blood flow on near-infrared spectroscopy signals measured on the forehead during a verbal fluency task,” Neuroimage **57**(3), 991–1002 (2011). [CrossRef] [PubMed]

**42. **B. Hallacoglu, A. Sassaroli, M. Wysocki, E. Guerrero-Berroa, M. Schnaider Beeri, V. Haroutunian, M. Shaul, I. H. Rosenberg, A. M. Troen, and S. Fantini, “Absolute measurement of cerebral optical coefficients, hemoglobin concentration and oxygen saturation in old and young adults with near-infrared spectroscopy,” J. Biomed. Opt. **17**(8), 081406 (2012). [CrossRef] [PubMed]