## Abstract

In this paper, the problem of rational splitter and OLT card installation in fiber-to-the-home (FTTH) networks is considered. We assume that the most tedious part of FTTH network deployment, i.e., trenching and cable installation, is completed, and the operator can start to connect customers to the network. The connecting process still requires expenditures for both passive (splitters) and active (ONUs, OLT cards) equipment. We define the problem of rational splitter and OLT card deployment and present the multi-state optimization (MuSO) approach to handle it. MuSO minimizes the expected total cost of deployment using a stochastic model; thus, it optimizes under stochastic client behavior. Finally, we compare MuSO to other rational methods, numerically validating its efficiency. We also investigate the impact of various factors—e.g., the take-up rates assumed when designing a network, delays in providing service to newly acquired customers, and optimization time limits—on the final FTTH network deployment cost. The presented methodology can be used not only to minimize the deployment cost of FTTH networks, but also to assess the impact of uncertainty on network designs returned by automated methods.

© 2017 Optical Society of America

## I. Introduction

Contemporary telecommunication transport infrastructures consist of interconnected networks that are, according to their roles, classified into access, aggregation, and core segments. Networks of either the core or the aggregation segment—thanks to the exploitation of high-capacity DWDM optical systems—are capable of handling extremely large volumes of highly aggregated IP traffic and practically do not restrict the bandwidth that can be delivered to the end users. Surprisingly, despite the fact that the high-capacity optical transport equipment installed there is extremely expensive, its cost has little influence on the fees paid by the final customers. The reason is that the relatively sparse resources of these networks are shared by huge numbers of users. The opposite is true in the case of access segment networks. The closer the network resources are to the end users’ premises, the more numerous and more dedicated they become, and the greater the share of the total network cost they generate. The implication is that every deployment in the access network that affects a large group of end users is a costly and long-term investment that has to be economically motivated and carefully planned.

Incumbent local exchange carriers (ILECs) entered the Internet era with access networks comprised of huge numbers of copper-pair cables suitable for the transmission of voice signals in the telephone bandwidth. When expectations for high-speed Internet access arose, ILECs invested large amounts of money to develop the digital subscriber line (DSL) technique that would provide the possibility of effectively transferring data over their existing copper access infrastructure between the head-end node (DSLAM) and DSL modems at customer premises. As traffic collected by a DSLAM device is relatively low, a further aggregation step is required that would connect a set of DSLAM devices to metro networks and, finally, to the broadband gateway nodes (BGNs) of the core networks.

Nowadays, every such solution exploits transmission over optical fibers. The consecutive generations of DSL (ADSL, HDSL, VDSL) require the length of the copper segment to be continuously shortened, as DSLAM nodes need to move closer to the customer premises. The solutions, generally referred to as fiber to the $x$-location (FTTx) networks, evolved from FTTN, through FTTC and FTTB, to FTTH (the last letter represents, respectively, node, curb, building, and, finally, the home of the end user). In the last (FTTH) case, the DSL part is definitively removed from the access network, as fibers directly enter optical modems (ONTs) situated at customer premises (homes).

High-speed access to the Internet is considered a necessary condition for the economic growth of regions. In densely populated areas, where a high return on investment (ROI) is expected, FTTx solutions are deployed by both ILECs and competitive local exchange carriers (CLECs). In the latter case, a CLEC can build either FTTN/FTTC networks, taking advantage of the existing copper pairs that—due to local loop unbundling laws—must be exposed by ILECs, or FTTB/FTTH networks, where the CLEC enters customer premises directly with the fiber. Conversely, in low-population and especially remote areas, business incentives seem not to work—the deployment of FTTx networks there is enforced by law (mandatory obligations for operators) or subsidized by local municipalities.

The FTTH network optimization and design is a very broad field that can be entered from many directions, using various techniques based on mathematical programming [1–4], pure heuristics [5], or a mixture of the above [6]. A comprehensive survey on various FTTH-related network optimization problems can be found in Ref. [7], while an engineering approach to FTTH network design is neatly presented in Ref. [8].

One of the fundamental problems in FTTH deployment is uncertainty. Various aspects of the deployment design can suffer from it, ranging from material and labor costs to unexpected real estate investments and unpredictable take-up rates. The majority of the research published in the field of FTTH design either does not consider uncertainty or takes it into account only in preliminary design phases, i.e., demands are overestimated, or expected material costs are exaggerated. Only a handful of researchers incorporate uncertainty directly into their design problems, tackling it in various ways, e.g., by addressing the future growth of demand volumes [9], using a polyhedral model to express demand volumes [10], or even applying game theoretical approaches [11].

We expect that several novel approaches to uncertainty in FTTH network design will appear in the future. However, the way FTTH networks are actually deployed in practice leaves us without any method allowing for an efficient comparison (in terms of costs) of different network designs when uncertainty is involved, because the cost heavily depends on the deployment strategy—the same initial network design can generate significantly different costs, depending on the way it is deployed. The papers mentioned above [9–11] incorporate uncertainty into the design phase and return optimized network designs that should then be deployed. However, FTTH networks are deployed in two steps. First, the core of the network is built; it consists of cabinets and fiber cables that connect the cabinets to central locations accommodating OLTs. When all the targeted areas are within the reach of the deployed core network, the advertising campaign begins, and newly acquired customers are connected ad hoc. This means that a part of the equipment that is relatively expensive but can be deployed almost immediately, namely, splitters and OLT cards, is deployed only when particular customers decide to connect. The methods introduced in this paper take this nature of FTTH network deployment into account and aim to allow network designs to be evaluated in the presence of uncertainty by proposing a rationalized way of deploying optimized FTTH network designs obtained from other sources.

In our research, we assume that an FTTH network has been designed and is now being deployed. The deployment has to follow the design; thus, there is no room for radical changes in the network plan. We assume that the network has been designed in such a way that all potential clients can connect to the network. Notice that this does not mean that a 100% take-up rate was assumed during the design of the network (see Ref. [12]). Still, if the network can accommodate 100% of potential clients, then during normal operation the network will most probably be underutilized. Therefore, to generate savings and improve the cash flow, not all elements of the network have to be deployed immediately. In our research, we assume that all trunk and distribution cables are deployed from the beginning, but the deployment of OLTs, drop cables, and splitters can be delayed. We assume that clients will be connected to the network in groups, reflecting the fact that newly acquired clients can wait a couple of days for service. This allows us to connect clients (deploying drop cables and installing the relevant splitters and OLT cards if needed) knowing that other clients are also willing to connect. Still, the number and locations of future clients remain unknown.

Consider the simple example presented in Fig. 1. Two multi-dwelling units (MDUs) are connected to a central office, each having a dedicated access point that can host one splitter with 64 outputs and one splitter with 32 outputs. Assume that each MDU has 90 flats, the take-up rate is 100%, and the utilized FTTH technology allows for a maximum split of 64. In such a case, all splitters located in the access points in the design should be deployed. Additionally, one two-output splitter should be installed in the distribution point to connect the 32-output splitters installed in the access points to a single OLT port; thus, three OLT ports should be utilized in the central office. On the other hand, when the take-up rate is 2/3, only the 64-output splitters are needed, and two OLT ports should be utilized. Finally, with a take-up rate equal to 1/3, only the 32-output splitters (together with a two-output splitter in the distribution point) and one OLT port should be used. Notice that we do not rebuild or re-plan the network, but only decide which splitters from the followed network plan to deploy.

The example gets much more complicated when uncertainty is taken into account. Consider the probability mass functions presented in Fig. 2. For the left MDU, there are ten equally probable final states; thus, each of the final states has a probability of 0.1. On the other hand, there are 45 equally probable final states for the right MDU; thus, the probability of each of them equals $1/45\approx 0.022$. Considering the MDUs separately, it seems that a 32-output splitter should be installed in the left MDU (which, in turn, requires the installation of a 2-output splitter in the distribution point), while in the right MDU, a 64-output splitter should be installed first, and if the actual number of clients exceeds 64, a 32-output splitter should be added. This approach would minimize the expected cost of the splitters and OLT ports from the egoistic point of view of a single access point, always ending with two utilized OLT ports: one for the 64-output splitter and one for the 32-output splitter(s). However, when both MDUs are considered jointly, the splitter installation strategy for the right MDU should be different. As the left MDU always has to use its 32-output splitter, using the 32-output splitter designed for the right MDU does not incur any additional costs related to OLT ports: both $1:32$ splitters are connected to the same two-output splitter at the distribution point. Notice that we cannot connect the 32-output splitter directly to an OLT port, as it would not be coherent with the followed network plan. Therefore, a better strategy for the right MDU is to install a 32-output splitter first and add a 64-output splitter if needed.
Assuming the splitter costs indicated in Table I, the expected cost of the utilized splitters in the former approach is 270 (one $1:32$ splitter, one $1:2$ splitter, and one $1:64$ splitter will always be installed; the second $1:32$ splitter will be installed in 1/9 cases), while in the latter approach, it is more than 325 (two $1:32$ splitters and one $1:2$ splitter will always be installed; one 1:64 splitter will be installed in 37/45 cases). However, assuming that a single OLT port costs 1,000 (see Table I), the expected cost of the OLT ports in the former approach is 2,000, while in the latter approach, it is less than 1,823, because in 8/45 cases, the second OLT port will not be needed. Summarizing, the latter approach is more than 5% cheaper on average when both the splitter and OLT costs are taken into account.
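The OLT-port part of this comparison can be verified directly from the probabilities stated above; the snippet below recomputes the expected OLT-port costs of the two strategies for the right MDU scenario (the per-port cost of 1,000 is taken from the text; splitter costs from Table I are not reproduced here):

```python
# Expected OLT-port cost of the two strategies from the example of Fig. 2.
# 64-first strategy: both splitters are always fed by separate OLT ports,
# so two ports are always in use.
# 32-first strategy: the second port (feeding the 1:64 splitter) is needed
# only in 37 of the 45 equally probable final states of the right MDU.
OLT_PORT_COST = 1000

cost_64_first = 2 * OLT_PORT_COST                       # always two ports
cost_32_first = OLT_PORT_COST + (37 / 45) * OLT_PORT_COST

print(cost_64_first)                 # 2000
print(round(cost_32_first, 2))       # 1822.22, i.e., "less than 1,823"
```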

In this paper, the following problem is considered. Assume a scenario in which a network provider builds an FTTH network. The deployment follows a detailed plan that specifies exact ways each customer should be connected, including all cables and splitters that have to be installed. Customers are connected to access points, and from the viewpoint of an access point, all its customers are indistinguishable; thus, there is some elasticity left while connecting arriving customers. The designed FTTH network is being deployed, and the first part of the process is already finished, i.e., cables and cabinets are deployed, and the exact locations of potential splitters are known. In the second part of the deployment, customers have the possibility to connect to the network. If they decide to connect, an appropriate combination of splitters and OLT ports from the followed network plan has to be deployed in order to provide service to the newly arriving customers. Our goal is to select the best possible combination of equipment that minimizes the final total cost of the network.

This paper is organized as follows. In Section II, the considered problem is formally presented. An efficient solution to the problem is presented in Section III. It is followed by the numerical results in Section IV. The paper concludes in Section V.

## II. Formalized Problem

In this section, the problem briefly described in Section I is formalized. We continue to use the example of the previous section to facilitate the explanation of the mathematical model.

Sets:

| Set | Definition |
| --- | --- |
| $\mathcal{L}$ | locations |
| $\mathcal{A}$ | access points; $\mathcal{A}\subset \mathcal{L}$ |
| $\mathcal{P}$ | equipment types |
| $\mathcal{E}$ | installations; the term installation $e$ means a pair $e=(l,p)$, where $l\in \mathcal{L}$ and $p\in \mathcal{P}$ |
| ${\mathcal{E}}_{a}$ | installations available in access point $a$; ${\mathcal{E}}_{a}\subset \mathcal{E}$, $a\in \mathcal{A}$ |
| ${\mathcal{E}}_{e}$ | installations connected to installation $e$; ${\mathcal{E}}_{e}\subset \mathcal{E}$, $e\in \mathcal{E}$ |
| $\mathcal{C}$ | clients |
| $\mathcal{T}$ | ordered set of time periods |

In the example, set $\mathcal{L}$ consists of four locations: two access points, one distribution point, and one central point. Set $\mathcal{A}$ consists of two (left and right) access points. Set $\mathcal{P}$ consists of five elements: three splitters ($1:2$, $1:32$, and $1:64$), one OLT port, and one OLT. Installation set $\mathcal{E}$ contains all the splitters depicted in Fig. 1 in their respective locations. Additionally, it contains an OLT port that can be used multiple times (see the following explanations) and an OLT; thus, the whole set consists of seven elements. There are two subsets ${\mathcal{E}}_{a}$ of the installation set, each corresponding to a single access point. Each of the subsets consists of the two splitters that are available in the corresponding access point. Finally, there are three subsets ${\mathcal{E}}_{e}$. The first, namely, ${\mathcal{E}}_{\mathrm{OLT}}$, contains all installations directly connected to the OLT, which is the OLT port. The second, namely, ${\mathcal{E}}_{\mathrm{OLT}\_\text{port}}$, contains all installations directly connected to the OLT port, which are the 64-output splitters located in the access points and the $1:2$ splitter located in the distribution point. The last subset, namely, ${\mathcal{E}}_{1:2}$, contains all elements connected to the $1:2$ splitter, which are the 32-output splitters located in the access points. The complete setup is indicated in Fig. 3.

Set $\mathcal{C}$ contains all the arriving clients. In this research, set $\mathcal{C}$ is unknown and can only be predicted—the issue of uncertainty will be described in detail later in this paper. For now, we temporarily assume that set $\mathcal{C}$ is given. The last introduced set, i.e., $\mathcal{T}$, contains the time periods for which decisions have to be made; it will be discussed in detail in the remaining part of this section.

Constants:

| Constant | Definition |
| --- | --- |
| $o(e)$ | number of outputs provided by installation $e$; $e\in \mathcal{E}$ |
| $\xi (e)$ | cost of installation $e$; $e\in \mathcal{E}$ |
| $u(e)$ | maximum number of installations $e$; $e\in \mathcal{E}$ |
| $d(c,a)$ | 1 if client $c$ is connected to access point $a$, 0 otherwise; $c\in \mathcal{C}$, $a\in \mathcal{A}$ |
| $t(c)$ | time period in which client $c$ is connected; $t(c)\in \mathcal{T}$ |

The first constant is self-explanatory for splitters. For an OLT port, it equals 1, as one port can provide a signal to only one splitter. For an OLT, it equals the number of its ports, e.g., 8 or 16. The second constant is also self-explanatory. However, notice that some elements can be of zero cost, e.g., if an OLT is non-modular and always contains a fixed number of ports, the cost of its ports equals zero. The third constant depends on the followed network plan and is equal to the number of instances of a particular equipment type located in a particular (single) location. In the example, it equals 1 for all installations except for the OLT port, for which it equals 3.

Constants $d(c,a)$ and $t(c)$ result from an assumed client arrival process. For instance, we can assume that $|\mathcal{C}|$ follows the Poisson distribution, with the mean value corresponding to a 0.3 take-up rate, and the arrivals are uniformly distributed among the time periods from $\mathcal{T}$. Other assumptions concerning those values are also possible—the presented methodology is not restricted to the Poisson distribution only.
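As an illustration (with purely hypothetical parameters), such an arrival process can be sampled as follows; the Poisson draw is implemented with Knuth's multiplication method so that the snippet needs only the standard library:

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw from a Poisson distribution using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def sample_arrivals(access_points, periods, homes_passed, take_up, rng):
    """Draw a client set C together with constants d(c, a) and t(c).

    |C| is Poisson with mean take_up * homes_passed; each client is then
    assigned a uniformly random access point and time period.
    """
    n_clients = sample_poisson(take_up * homes_passed, rng)
    return [{"a": rng.choice(access_points), "t": rng.choice(periods)}
            for _ in range(n_clients)]

# Hypothetical instance: two access points, 12 periods, 180 homes, 0.3 take-up.
rng = random.Random(7)
clients = sample_arrivals(["left", "right"], list(range(1, 13)),
                          homes_passed=180, take_up=0.3, rng=rng)
```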

Variables:

In the problem, the installation order of equipment is optimized. For this purpose, variables denoting which installation is deployed in which time period are needed. The role is assumed by variables ${x}_{et}$, allowing us to formulate the problem as follows:
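The equation block of model (1) is missing from this version of the text; a plausible reconstruction, based on the constraint descriptions in the following paragraph, is given below (the exact published form may differ):

```latex
\begin{align}
\min \quad & \sum_{e \in \mathcal{E}} \xi(e)\, x_{e,\max\mathcal{T}} \tag{1a}\\
\text{s.t.} \quad
& \sum_{e \in \mathcal{E}_a} o(e)\, x_{et} \ge \sum_{c \in \mathcal{C}:\, t(c) \le t} d(c,a),
  && a \in \mathcal{A},\ t \in \mathcal{T}, \tag{1b}\\
& \sum_{e' \in \mathcal{E}_e} x_{e't} \le o(e)\, x_{et},
  && e \in \mathcal{E},\ t \in \mathcal{T}, \tag{1c}\\
& x_{et} \le u(e),
  && e \in \mathcal{E},\ t \in \mathcal{T}, \tag{1d}\\
& x_{et} \ge x_{e,t-1},
  && e \in \mathcal{E},\ t \in \mathcal{T}\setminus\{\min\mathcal{T}\}, \tag{1e}\\
& x_{et} \in \mathbb{Z}_{\ge 0}.
\end{align}
```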

The optimization goal is to minimize the final cost of deployment. Therefore, in objective Eq. (1a), only variables that denote the final state of the deployment, i.e., when $t=\mathrm{max}\text{\hspace{0.17em}}\mathcal{T}$, are taken into account. Such a simplification is a result of an assumption denoted in Eq. (1e) stating that deployed installations cannot be removed. Another three assumptions are as follows: there should be enough resources to provide signals to all the clients that have already been connected [Eq. (1b)]; if a deployed installation depends on another installation, then another installation has to be deployed [Eq. (1c)]; and finally, the number of deployed installations is limited by the followed network plan [Eq. (1d)].

The problem is $\mathcal{NP}$-complete, because the set cover problem [13] can be reduced to it in polynomial time. Consider that the set to cover is the set of access points, and possible covering subsets are represented by splitters located in the central office that are to be connected to fibers reaching access points representing elements in the covering sets. Solving the problem of reaching each access point with at least one fiber using the minimum number of splitters is equivalent to solving the set cover problem.

Although $\mathcal{NP}$-complete, the problem presented in Eq. (1) seems not to be extremely difficult—it is possible to remove the indices $t$ from it and optimize the deployed installations only for the final state. However, notice that we have temporarily assumed that set $\mathcal{C}$ is known in advance. In fact, the crux of the problem is that both set $\mathcal{C}$ and constants $d(c,a)$ and $t(c)$ are subject to uncertainty. This means that the problem considered in this paper consists of multiple stages, and variables ${x}_{et}$ have to be set knowing the exact values of $d(c,a)$ and $t(c)$ only for clients that have appeared during or before time period $t$. The actual size of set $\mathcal{C}$ and the values of $d(c,a)$ and $t(c)$ for the remaining clients have to be predicted. The whole process is presented in Algorithm 1.
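Algorithm 1 itself is not reproduced here; the multi-stage loop it describes can be sketched as follows (the `observe_arrivals` and `optimize_deployment` callables are hypothetical placeholders for the arrival process and for solving the model with predicted data):

```python
def deploy_over_time(periods, observe_arrivals, optimize_deployment):
    """Multi-stage deployment loop: in each period, only the arrivals seen
    so far are known; future clients must be predicted by the optimizer."""
    connected = []          # clients observed up to the current period
    deployed = {}           # installation e -> number of deployed copies
    for t in periods:
        connected += observe_arrivals(t)   # reveal d(c,a), t(c) for new clients
        decisions = optimize_deployment(t, connected, deployed)
        for e, count in decisions.items():
            # deployments are irreversible, i.e., counts never decrease
            deployed[e] = max(deployed.get(e, 0), count)
    return deployed
```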

Such a problem can be efficiently solved using dynamic programming when the access points are not related [14]. However, when relations between access points are present, e.g., they can use the same OLT port, using the approach of Ref. [14] is problematic. The approach evaluates $|\mathcal{V}|!$ different states, where $|\mathcal{V}|$ is the number of splitters in an access point. When relations between access points exist, the number of states would increase to $\prod _{a\in \mathcal{A}}|{\mathcal{C}}_{a}|$, where $|{\mathcal{C}}_{a}|$ is the number of clients in access point $a$. Notice that in this case, we have to consider the number of clients, not the number of splitters, because decisions have to be made whenever any of the access points is saturated, and those events are unrelated. Using this approach to solve a practical problem of 100 access points with 100 clients each would require evaluating ${100}^{100}$ states, which is undoubtedly impossible. Still, the problem can be approached using the novel and efficient method presented in the following section.

## III. Solutions

This section is divided into two parts. In the first subsection, benchmark approaches are presented. We consider a wide spectrum of various techniques, starting from simple rules, through greedy approaches, and ending at the MIP-facilitated optimization for a centroid of the uncertainty set. In the second subsection, our novel approach is presented. It is an MIP-based method that considers only a small but representative subset of final network states that are present in the uncertainty set. The method is called multi-state optimization (MuSO).

#### A. Benchmark Approaches

##### 1) Local Approach:

One of the simplest approaches is to take only local information into account. There is a given set of splitters that can be installed in each access point. Each time a splitter has to be installed, we can either always select the cheapest splitter or the splitter with the lowest cost per output, which is usually the largest available splitter. If a network is properly designed, i.e., the take-up rate assumed during the design phase is correct, then the first approach is very inefficient—it usually ends with all the available splitters installed. Therefore, in our experiments, we use the second approach, i.e., selecting the splitter with the lowest cost per output, as one of the benchmarks. When the local approach is used to solve the example of Fig. 1, $1:64$ splitters will be installed first in both MDUs, because they are cheaper in terms of cost per output than $1:32$ splitters.
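A minimal sketch of this rule follows; the splitter costs are hypothetical, chosen only so that the $1:64$ splitter wins on cost per output, as in the example:

```python
def pick_local(available):
    """Local rule: among the splitters still available in an access point,
    pick the one with the lowest cost per output."""
    return min(available, key=lambda s: s["cost"] / s["outputs"])

# Hypothetical costs: the 1:64 splitter is cheaper per output (2.34 vs 2.81).
splitters = [{"name": "1:32", "outputs": 32, "cost": 90},
             {"name": "1:64", "outputs": 64, "cost": 150}]
print(pick_local(splitters)["name"])   # 1:64
```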

##### 2) Greedy Approach:

In the greedy approach, the option that is cheapest at the moment of making the decision is always selected. If the splitters in an access point were always directly connected to an OLT port, the greedy approach would reduce to the local approach selecting the cheapest splitter, which, as noted, is extremely inefficient. However, in our problem, the splitters are not always directly connected to an OLT port. If splitter ${s}_{1}$ is connected to splitter ${s}_{2}$ that has already been deployed, installing splitter ${s}_{1}$ does not require deploying another OLT port. Therefore, splitters that, according to the network plan, are connected to already deployed splitters are much cheaper than splitters that require deploying a new OLT port; thus, they are prioritized during the selection. This approach is selected as the second benchmark. When the greedy approach is used to solve the example of Fig. 1, $1:32$ splitters will be installed first in both MDUs because their total installation cost is smaller. Notice that the installation cost of the first splitter equals the sum of the costs of a $1:32$ splitter, a $1:2$ splitter, and an OLT port, while the installation cost of the second splitter equals solely the cost of a $1:32$ splitter, because the rest of the necessary equipment has already been installed (to support the first splitter).
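The greedy rule can be sketched as follows; the key point is that the price of a splitter includes the price of every not-yet-deployed upstream installation (parent splitters, OLT ports) it needs. All costs and names are hypothetical:

```python
def marginal_cost(splitter, upstream_chain, deployed, costs):
    """Cost of installing `splitter` plus all not-yet-deployed upstream
    equipment it needs (e.g., a 1:2 splitter and an OLT port)."""
    total = costs[splitter]
    for parent in upstream_chain:
        if parent not in deployed:
            total += costs[parent]
    return total

costs = {"1:32": 90, "1:64": 150, "1:2": 20, "OLT_port": 1000}  # hypothetical
deployed = set()
# First decision in an empty MDU: the 1:32 chain is cheaper overall...
c32 = marginal_cost("1:32", ["1:2", "OLT_port"], deployed, costs)   # 1110
c64 = marginal_cost("1:64", ["OLT_port"], deployed, costs)          # 1150
# ...and once the 1:2 splitter and OLT port exist, a second 1:32 costs only 90.
```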

##### 3) Centroid Approach:

The most advanced benchmark approach used in this research is the centroid approach. In this approach, for each $t\in \mathcal{T}$, Eq. (1) is solved by taking into account the expected values for uncertain sets and constants. The approach will be explained in detail later, because a formal explanation of the approach requires a notation that has not been presented yet.

#### B. Multi-State Optimization

In the MuSO approach, we consider a subset of possible final states and optimize the selection using the MIP methodology. The idea is to make current decisions taking into account a number of plausible final states. We simultaneously optimize the final cost of the network in all considered final states, taking the weighted sum of the obtained costs as the expected final cost of the network. The ways the demands of different final states are met are almost independent of each other—the installations deployed in two different final states can differ significantly. The only coupling is that the installations deployed in the considered time period must be present in the final deployments of the network for all considered final states.

To formally explain the approach, the notation presented in Section II has to be extended as follows.

Sets:

By the term final state, we understand a possible content of set $\mathcal{C}$ and values of constants $d(c,a)$ and $t(c)$. The set of all possible final states depends on assumptions concerning the number of clients and can be very large. In the worst case, the number of all possible final states can be as large as $\prod _{a\in \mathcal{A}}|\mathcal{T}|\xb7|{\mathcal{C}}_{a}|$, where $|{\mathcal{C}}_{a}|$ is the number of potential clients in access point $a$. In MuSO, only a subset of those final states is taken into account. The subset is denoted by $\mathcal{S}$.

Constants:

| Constant | Definition |
| --- | --- |
| ${t}^{\star}$ | considered time period; ${t}^{\star}\in \mathcal{T}$ |
| $g(a)$ | number of clients in access point $a$ in time period ${t}^{\star}$; $g(a)=\sum _{c\in \mathcal{C}: t(c)\le {t}^{\star}}d(c,a)$ |
| $l(e)$ | number of already deployed installations $e$; $e\in \mathcal{E}$; $l(e)={\mathrm{max}}_{t\in \mathcal{T}: t<{t}^{\star}}{x}_{et}$ |
| $b(a,s)$ | final number of clients in access point $a$ in final state $s$; $a\in \mathcal{A}$, $s\in \mathcal{S}$ |
| $p(s)$ | probability of final state $s$; $\sum _{s\in \mathcal{S}}p(s)=1$ |

As explained in Algorithm 1, the whole optimization process is repeated for every $t\in \mathcal{T}$. The time period considered in the current loop of the process is denoted by ${t}^{\star}$. Since the process consists of multiple stages, some decisions have already been taken, while other decisions are still to be made. This limitation is represented by constants $l(e)$, which are directly derived from the variables ${x}_{et}$ that have already been set, i.e., for $t<{t}^{\star}$. Two entirely new constants are $b(a,s)$ and $p(s)$; together, they describe the main features of the final states. Constant $b(a,s)$ gives the final demand volume in each access point, while constant $p(s)$ gives the probability of the corresponding final state.

Variables:

In our approach, we optimize both the current and possible future decisions represented by the previously introduced variables ${x}_{e{t}^{\star}}$ and new variables ${y}_{es}$, respectively. The possible future decisions are related to the potential future states of the network. The optimization model is as follows:
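The equation block of model (2) is missing from this version of the text; a plausible reconstruction, based on the constraint descriptions in the following paragraph, is given below (the exact published form may differ):

```latex
\begin{align}
\min \quad & \sum_{s\in\mathcal{S}} p(s) \sum_{e\in\mathcal{E}} \xi(e)\, y_{es} \tag{2a}\\
\text{s.t.} \quad
& \sum_{e\in\mathcal{E}_a} o(e)\, y_{es} \ge b(a,s),
  && a\in\mathcal{A},\ s\in\mathcal{S}, \tag{2b}\\
& \sum_{e'\in\mathcal{E}_e} y_{e's} \le o(e)\, y_{es},
  && e\in\mathcal{E},\ s\in\mathcal{S}, \tag{2c}\\
& y_{es} \le u(e),
  && e\in\mathcal{E},\ s\in\mathcal{S}, \tag{2d}\\
& y_{es} \ge x_{e{t}^{\star}},
  && e\in\mathcal{E},\ s\in\mathcal{S}, \tag{2e}\\
& \sum_{e\in\mathcal{E}_a} o(e)\, x_{e{t}^{\star}} \ge g(a),
  && a\in\mathcal{A}, \tag{2f}\\
& x_{e{t}^{\star}} \ge l(e),
  && e\in\mathcal{E}, \tag{2g}\\
& x_{e{t}^{\star}},\, y_{es} \in \mathbb{Z}_{\ge 0}.
\end{align}
```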

In Eq. (2a), the average deployment cost over all considered final states is optimized; it is a weighted average of the total costs of all considered final states. In each final state, an adequate number of installations has to be deployed to satisfy all demands. This is enforced by Eqs. (2b) and (2c), which are similar in their meaning to Eqs. (1b) and (1c), respectively. The final deployments of the whole network in different final states, represented by variables ${y}_{es}$, are only loosely connected. The only two restrictions are that they cannot use more installations than the original network plan, which is enforced by Eq. (2d), and that they have to be compatible with the network state that will be deployed in the current time period, represented by variables ${x}_{e{t}^{\star}}$, which is enforced by Eq. (2e). As for the current decisions ${x}_{e{t}^{\star}}$, they have to provide the signal to all the clients that have arrived so far, which is enforced by Eq. (2f), and they have to be compatible with the existing network state, which is enforced by Eq. (2g).

Notice that the presented model is a stochastic model that minimizes the total expected cost. Such an approach can be of no use for institutions with a tight budget that have to be certain that the investment will not exceed the budget. Luckily, Eq. (2) can be easily transformed into a robust model by assuming that the optimization is performed for a single final state that dominates all generated final states, i.e., one containing, for each access point, the smallest number of clients that is not lower than in any generated final state. Alternatively, we can minimize the maximum possible cost by modifying Eq. (2a).

The complete MuSO method is depicted in Algorithm 2. It is invoked whenever decisions have to be made, i.e., for each $t\in \mathcal{T}$.

In Step 1, a number of plausible final states are generated. The efficiency of the process strictly depends on the available knowledge and assumptions concerning the future development of the situation. We assume that a probabilistic model of clients’ behaviors is available—it can be considered as a black box that produces possible final states, returning more probable states more often. The considered clients’ behavior problem is repeatable, because hundreds or even thousands of access points are covered in large deployments. Therefore, we believe that the best way to obtain clients’ behavior models (the black box) is to use machine learning algorithms on data from past deployments. If the data are not available, more theoretical models can be used instead. For instance, we can assume that the number of clients in each access point is modeled by a normal distribution, and their arrival times are modeled using a logistic function, which is commonly used to model diffusion of innovations [15]. The black box, whenever asked, returns one final state; thus, we can produce plausible final states that can be possible continuations of a current state by continuously generating final states from scratch using the black box and discarding those that are incoherent with the current network state. Notice that we do not require the actual final state to be in the set of plausible final states. In fact, taking into account that the number of possible final states is huge, with great probability, we can assume that the actual final state will never be in the set of plausible final states. Still, a state that is the average over all plausible final states should converge to a state that minimizes the mean squared error.
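The generation step can be sketched as rejection sampling around the black box (here a trivial stand-in that draws final client counts uniformly; the real model would come from machine learning or a diffusion model): a drawn state is kept only if it is a possible continuation of the current state, i.e., it does not fall below the client counts already observed:

```python
import random

def generate_plausible_states(sample_final_state, current_counts,
                              n_states, max_tries=10000):
    """Keep drawing final states from the black box, discarding those that
    are incoherent with the current network state."""
    states = []
    for _ in range(max_tries):
        state = sample_final_state()
        if all(state[a] >= current_counts[a] for a in current_counts):
            states.append(state)
            if len(states) == n_states:
                break
    return states

rng = random.Random(1)
# Hypothetical black box: final counts drawn uniformly per access point.
black_box = lambda: {"left": rng.randint(0, 90), "right": rng.randint(0, 90)}
current = {"left": 25, "right": 40}   # clients already present
plausible = generate_plausible_states(black_box, current, n_states=50)
```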

In Step 2, Eq. (2) is solved. In practical applications, the problem can usually be handled by commercial MIP solvers in an acceptable amount of time (less than 10 min). However, when many installations have to be deployed, e.g., when ${t}^{\star}=1$, even commercial solvers can have problems providing an optimal solution. In such cases, we minimize the optimality gap by improving the upper bound during half of the available time, using the built-in functionality of the utilized solver to polish the current incumbent solution.

In Step 3, the obtained values of ${x}_{e{t}^{\star}}$ are trimmed. Taking a closer look at the model presented in Eq. (2), one can notice that variables ${x}_{e{t}^{\star}}$ are not in the objective function; thus, they are not directly minimized in the optimization process. They have to be high enough to satisfy Eqs. (2f) and (2g). On the other hand, they have to be low enough to satisfy Eq. (2e). If those constraints are far from each other, variables ${x}_{e{t}^{\star}}$ can assume a range of feasible and equally optimal values from the viewpoint of Eq. (2). However, it does not mean that those ranges are equally optimal from the point of view of Algorithm 1. In Eq. (2), only a subset of potential final states is taken into account. Therefore, it is possible that in this subset, there will be only final states that overestimate the final number of clients in some access points. It can lead to a situation in which a number of installations are deployed in advance, i.e., at the moment of deployment, they do not provide a signal to any of the clients. In reality, there is always a chance that there will be no clients appearing in the future. Therefore, we should never deploy more equipment than needed in a considered time period. The trimming method is presented in Algorithm 3.

The procedure is repeated for each access point. In Step 2, the number of possible new installations, denoted by $n(e)$, is calculated using the following formula, which states that only installations used in all considered final states are taken into account in the subsequent steps of the algorithm:

In Step 3, we calculate the number of outputs needed to satisfy current demands. The constant is denoted by $m(a)$ and, for each $a\in \mathcal{A}$, is obtained using the following formula:

In Step 4, the obtained constants $n(e)$ and $m(a)$ are used to calculate the final trimmed results. The results are obtained by solving the classical knapsack problem [16] using dynamic programming [17]. We use a lexicographic objective function that first minimizes the number of utilized splitters and second maximizes the number of installed outputs. Notice that there is no strict relation between the finally deployed installations and the variables ${x}_{e{t}^{\star}}$, i.e., we can ultimately deploy more, fewer, or exactly as many installations as indicated by the obtained values of the variables ${x}_{e{t}^{\star}}$.

The MuSO algorithm is non-polynomial, because it requires solving MIP problems. In practice, the complexity of solving those MIP problems is the dominant component of the complexity of the whole method. In theory, the complexity of generating plausible final states can also be non-polynomial; however, it depends on the selected models of clients’ behavior and the utilized generating algorithms.
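The lexicographic knapsack of Step 4 can be sketched as follows (a simplified Python illustration; the paper cites an unbounded knapsack formulation [17], whereas this sketch treats the candidate installations as 0/1 items and abstracts away the exact definitions of $n(e)$ and $m(a)$):

```python
def trim_installations(outputs, demand):
    """Choose a subset of candidate installations, given by their output
    counts, that covers `demand`, first minimizing the number of
    installations and then maximizing total outputs. Returns chosen
    indices, or None if the demand cannot be covered. (A sketch, not the
    paper's exact Algorithm 3.)"""
    dp = {0: 0}        # total outputs -> min number of installations
    choice = {0: []}   # total outputs -> chosen installation indices
    for i, out in enumerate(outputs):
        for tot, cnt in list(dp.items()):  # snapshot: each item used once
            new_tot, new_cnt = tot + out, cnt + 1
            if new_tot not in dp or new_cnt < dp[new_tot]:
                dp[new_tot] = new_cnt
                choice[new_tot] = choice[tot] + [i]
    feasible = [t for t in dp if t >= demand]
    if not feasible:
        return None
    # lexicographic: fewest installations first, then most outputs
    best = min(feasible, key=lambda t: (dp[t], -t))
    return choice[best]
```

For example, with candidate installations offering 8, 4, 4, and 2 outputs and a demand of 9, the sketch selects the 8-output and one 4-output installation: two installations, 12 outputs in total.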

Finally, notice that the centroid approach is a special case of MuSO, in which there is only one final state that represents the centroid of the uncertainty set.

## IV. Experiments

The presented methods have been implemented in C#, compiled using Visual Studio, and run under Windows Server 2012. All experiments were run on a Fujitsu RX200 server with two Intel Xeon E5-2620v2 6C/12T 2.10 GHz processors and 80 GB of RAM. CPLEX 12.5.1 has been used to solve the MIP problems.

#### A. Test Cases

In the experiments, we used real-world network designs obtained for a small city inhabited by more than 250k potential customers grouped into almost 11k access points. The designs were generated by an automated FTTH optimization tool presented in Ref. [6]. When generating the designs, expected take-up rates of 0.3 and 0.5 were assumed. Still, thanks to mechanisms described in Ref. [12], the obtained designs were able to provide service to all potential customers.

The whole area was covered by 11 (for a 30% assumed take-up rate) or 13 (for a 50% assumed take-up rate) central locations, forming 11 or 13 FTTH trees. The number of access points in each FTTH tree is depicted in Fig. 4. As indicated in the figure, when the assumed take-up rate increases, the number of central locations also increases, and the obtained FTTH trees are smaller.

The costs of the splitters are presented in Table I. Notice that the presented costs include labor. Therefore, for the sake of simplicity, we assume that a raw fiber also requires installing a splitter ($1:1$ splitter in this case) to make it usable. We assumed that the cost of a single OLT port is significantly larger than the cost of a splitter, and it equals 1,000.

We use the numbers of time periods $|\mathcal{T}|\in \{4,12,52,252\}$, corresponding to the number of quarters, months, weeks, and working days^{1} in a year, respectively.

Finally, we assumed that the distribution of the number of clients in each access point followed the normal distribution, with the expected take-up rate equal to 30% or 50% and the variance equal to 0.3 of the access point’s maximum capacity. We assumed that the take-up rates used during the planning process were correct; thus, a 30% (50%) expected take-up rate was used for the network optimized for a 30% (50%) assumed take-up rate. Each access point was assigned an individual logistic function, which is commonly used to model the diffusion of innovations [15]; in our research, it was used to generate the actual arrival times of the clients, which were then used to distribute clients among the time periods. Notice that we treat the clients’ behavior model as a black box; thus, the proposed algorithm can work with any model. Remember that the considered problem is repeatable, and hundreds or even thousands of access points are covered in large deployments. Therefore, we believe that the best way to obtain clients’ behavior models is to use machine learning algorithms on data from past deployments. If such data are not available, more theoretical models, like the one assumed in this research, can be used instead.
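The client-arrival model described above can be sketched as follows (a minimal Python illustration; the parameter names `midpoint` and `steepness`, as well as the clipping details, are our choices — the paper does not specify them):

```python
import math
import random

def sample_access_point(capacity, take_up=0.3, var_frac=0.3, periods=12,
                        midpoint=0.5, steepness=8.0, rng=random):
    """Draw one plausible final state for a single access point:
    the number of new clients connecting in each time period."""
    # number of clients: normal with mean take_up*capacity and
    # variance var_frac*capacity, clipped to [0, capacity]
    n = round(rng.gauss(take_up * capacity,
                        math.sqrt(var_frac * capacity)))
    n = max(0, min(capacity, n))
    counts = [0] * periods
    for _ in range(n):
        # arrival time via the logistic quantile function on a
        # normalized horizon [0, 1], then binned into time periods
        u = min(max(rng.random(), 1e-9), 1.0 - 1e-9)
        t = midpoint + math.log(u / (1.0 - u)) / steepness
        t = min(max(t, 0.0), 1.0 - 1e-9)
        counts[int(t * periods)] += 1
    return counts
```

The logistic quantile concentrates arrivals around the midpoint of the horizon, mimicking the S-shaped adoption curve of the diffusion-of-innovations model.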

Each test was repeated three times for each FTTH tree, using different seeds to generate various client distributions. Standard tests were conducted, assuming $\left|\mathcal{S}\right|=10$ and a 30-min time limit for solving a single MIP problem. In all cases, half of the available time was granted to the standard CPLEX’s B&B, while the remaining time was used to polish the obtained solutions. The standard values were changed in some cases to test their influence on the performance of the algorithm.

#### B. General Efficiency

First, we investigated how the MuSO approach performs in comparison to the approaches presented in Subsection III.A. The obtained results are presented in Fig. 5. We tested five approaches: the local, greedy, and centroid approaches presented in Subsection III.A; MuSO, presented in Subsection III.B; and the oracle, an idealized approach that optimizes decisions knowing the exact future behavior of clients, i.e., it works as if $|\mathcal{T}|=1$, which means that all decisions are optimized simultaneously. Test cases are referred to as $X$p–$Y$t, in which $X$ is the number of time periods, while $Y$ is the planned take-up rate of the utilized network design and also the actual expected take-up rate of clients’ behaviors. Although five approaches have been tested, only three of them are directly depicted in the figures, because the local approach and the oracle serve as benchmarks: in all the following results, a 0% gain corresponds to the local approach, while a 100% gain corresponds to the oracle. The depicted results thus present the performance of the introduced approaches relative to this pair of benchmarks.
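This benchmark normalization can be expressed as a simple formula (our reading of the scheme — the exact expression is not spelled out in the text):

```python
def gain_percent(cost_method, cost_local, cost_oracle):
    """Normalized gain: 0% for the local approach, 100% for the oracle.
    Lower deployment cost means higher gain."""
    return 100.0 * (cost_local - cost_method) / (cost_local - cost_oracle)
```

For instance, a method whose total deployment cost lies exactly halfway between the local approach's and the oracle's would score a 50% gain.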

It is clearly indicated in the figure that the MuSO approach outperforms the other strategies, allowing for up to 30% more savings in comparison to the greedy approach. Interestingly, the centroid approach proved to be much less efficient than the MuSO approach. Notice that for smaller take-up rates (30%), the centroid approach provided solutions that were worse than the solutions provided not only by the MuSO approach, but also by the greedy approach. The reason is that the centroid approach acts very conservatively—it does not look for opportunities to minimize costs. Consider a simple example with two possible scenarios, A and B, such that A happens twice as often as B, but B allows for some extra savings. Figuratively speaking, the MuSO approach will see that B can generate some extra savings, but that it is less probable; the centroid approach will see only the more probable scenario A; and the greedy approach will consider only the most profitable scenario B, regardless of its probability. In some cases, the greedy strategy can be profitable (see the case with the 30% take-up rate). However, in other cases, the results can be very poor, as in the case of the 50% take-up rate.

#### C. Number of Considered Final States

Second, we investigated the impact of the number of considered final states on the gain. The obtained results are presented in Fig. 6. We tested six different numbers of considered final states, ranging from 1 to 20.

The obtained numerical results indicate that increasing the number of considered final states also increases the obtained gain. In our experiments, considering only three random plausible final states resulted in better gains than the gain obtained considering the centroid of the uncertainty set. However, when the number of considered final states became large, i.e., exceeded ten, the gain generated by the more detailed representation of the uncertainty set (an increased number of final states) was limited, probably due to computational problems arising from the increased size of the considered MIP problems. In fact, the gains obtained for 15 or 20 final states were slightly worse than the gain obtained for just 10 plausible final states.

#### D. Computation Time

We also investigated the runtimes of the proposed MuSO approach. The experiments were performed on the network designed for the 30% take-up rate. We tested the impact of the number of considered final states and the number of time periods on the total runtime. When testing the impact of the number of considered final states, 12 time periods were considered. The remaining parameters assumed the standard values. The results are presented in Fig. 7.

As clearly indicated in the figure, the total runtime increases significantly when the number of considered final states increases. The relation is clearly visible for smaller numbers of final states. When a larger number of final states is considered, e.g., 15 or 20, the relation weakens due to the limits imposed on the runtimes of a single instance of an MIP problem.

On the other hand, the total runtime decreases with the number of considered time periods. Although the total number of solved MIP problems increases, their complexity becomes significantly smaller. In all test cases, the majority of the runtime was used to generate a solution for the first time period. The complexity of MIP problems related to the remaining time periods was significantly smaller, and, in most cases, the problems were solved almost immediately. As for the MIP problem related to the first time period, its complexity directly depends on the number of clients connecting in that time period. When the number of connecting clients increases, the related MIP problem becomes more complex. Therefore, when the number of time periods increases, the number of clients in the first time period decreases; thus, the total runtime also decreases.

#### E. Different Distributions

Finally, we investigated the impact of the clients’ behavior model on the result. The experiments were performed on the network designed for the 30% take-up rate with 12 time periods. We tested four different actual take-up probability distribution functions, namely: the normal distribution with the variance equal to 0.3 of the capacity of a considered access point and with two different mean take-up rates (30% and 50%) and, similarly, the binomial distribution with 30% and 50% mean take-up rates. The distributions are referred to as $N(30)$, $N(50)$, $B(30)$, and $B(50)$, respectively. All test cases were performed on a network that had been designed assuming a 30% take-up rate. The obtained results are presented in Fig. 8.

As indicated in the figure, in all the tested cases, the MuSO approach exhibited its superiority over the benchmark approaches. In cases with smaller variability, i.e., those characterized by the binomial distribution, the results returned by MuSO approached optimality, reaching a gain of almost 90%, while in the large-variability cases, the gain remained below 50%. Still, in both cases, the advantage generated by the MuSO approach was quite stable, averaging more than 5% over the centroid approach.

The greedy approach, although giving adequate solutions for part of the considered test cases, can also be very inefficient, as indicated for the $B(50)$ distribution. This means that to reasonably solve the considered problem, complex methods like the centroid approach or the MuSO approach have to be used. Still, as the complexity of both of these approaches is similar and the MuSO approach consistently gives better results than the centroid approach, the former method seems to be the best option.

## V. Conclusion

In this paper, we considered a decision problem related to the optimal (rational) installation of network equipment, i.e., splitters and OLT ports, under the assumption that customer subscriptions during the considered time horizon adhere to some stochastic arrival process. We split the time horizon into a number of time periods of equal length. In each time period, taking into account the current installation state and some estimations on the future behavior of the arrival process, we decide on equipment that should be deployed in the network. We aim at minimizing the total cost of equipment installed at the end of the time horizon, i.e., in the final state of the arrival process.

In this paper, we defined the problem of the rational splitter and OLT card deployment and presented a method called multi-state optimization, which was able to provide, in a reasonable amount of time, a rational solution for the real-world problem instances. The method was implemented and numerically verified. Its efficiency was compared with a couple of proposed benchmark approaches. We also investigated the impact of various factors on the final FTTH network deployment cost, e.g., the take-up rates assumed when designing a network, delays in providing the service to newly acquired customers (number of time periods), or the number of considered scenarios.

We conclude that our method works and is generally able to deliver results significantly better than the solutions provided by the benchmark approaches.

## Acknowledgment

This research was supported by the National Science Centre, Poland, under grant no. 2014/15/D/ST7/05320 “Open problems in large-scale optical access network design.”

## Footnotes

^{1} In Poland, in 2016.

## References

**1. **J. Li and G. Shen, “Cost minimization planning for greenfield passive optical networks,” J. Opt. Commun. Netw., vol. **1**, no. 1, pp. 17–29, 2009. [CrossRef]

**2. **M. Chardy, M.-C. Costa, A. Faye, and M. Trampont, “Optimizing splitter and fiber location in a multilevel optical FTTH network,” Eur. J. Oper. Res., vol. **222**, pp. 430–440, 2012. [CrossRef]

**3. **C. Hervet and M. Chardy, “Passive optical network design under operations administration and maintenance considerations,” J. Appl. Oper. Res., vol. **222**, no. 3, pp. 152–172, 2012.

**4. **A. Bley, I. Ljubić, and O. Maurer, “Lagrangian decompositions for the two-level FTTx network design problem,” EURO J. Comput. Optim., vol. **1**, no. 3, pp. 221–252, 2013. [CrossRef]

**5. **A. Mitcsenkov, G. Paksy, and T. Cinkler, “Geography- and infrastructure-aware topology design methodology for broadband access networks (FTTx),” Photon. Netw. Commun., vol. **21**, no. 21, pp. 253–266, 2011. [CrossRef]

**6. **M. Żotkiewicz, M. Mycek, and A. Tomaszewski, “Profitable areas in large-scale FTTH network optimization,” Telecommun. Syst., vol. **61**, no. 3, pp. 591–608, 2016. [CrossRef]

**7. **M. Grötschel, C. Raack, and A. Werner, “Towards optimizing the deployment of optical access networks,” EURO J. Comput. Optim., vol. **2**, no. 1, pp. 17–53, 2013.

**8. **J. Segarra, V. Sales, and J. Prat, “Planning and designing FTTH networks: Elements, tools and practical issues,” in 14th Int. Conf. on Transparent Optical Networks, Coventry, England, July 2012.

**9. **K. F. Poon and A. Ouali, “A MILP based design tool for FTTH access networks with consideration of demand growth,” in 6th Int. Conf. on Internet Technology and Secured Transactions, Abu Dhabi, United Arab Emirates, Dec. 2011.

**10. **C. Hervet, A. Faye, M.-C. Costa, M. Chardy, and S. Francfort, “Solving the two-stage robust FTTH network design problem under demand uncertainty,” in Int. Network Optimization Conf., Costa Adeje, Spain, May 2013.

**11. **K. Casier, B. Lannoo, J. V. Ooteghem, S. Verbrugge, D. Colle, M. Pickavet, and P. Demeester, “Game-theoretic optimization of a fiber-to-the-home municipality network rollout,” J. Opt. Commun. Netw., vol. **1**, no. 1, pp. 30–42, 2009. [CrossRef]

**12. **M. Żotkiewicz and M. Mycek, “Impact of demand uncertainty models on FTTH network design,” in 18th Int. Conf. on Transparent Optical Networks (ICTON), July 10–14, 2016, pp. 1–4.

**13. **R. M. Karp, “Reducibility among combinatorial problems,” in *Complexity of Computer Computations*, R. E. Miller and J. W. Thatcher, Eds. New York, New York: Plenum, 1972, pp. 85–103.

**14. **M. Żotkiewicz, “Handling uncertain behavior of clients during FTTH splitter allocation,” in INFORMS Telecommunications Conf., Boca Raton, Florida, Mar. 2016.

**15. **E. Rogers, *Diffusion of Innovations*, 5th ed. New York, New York: Simon and Schuster, 2003.

**16. **T. Dantzig, *Numbers: The Language of Science*. Hoboken, New Jersey: Wiley-Interscience, 1930.

**17. **R. Andonov, V. Poirriez, and S. Rajopadhye, “Unbounded knapsack problem: Dynamic programming revisited,” Eur. J. Oper. Res., vol. **123**, no. 2, pp. 394–407, 2000. [CrossRef]