In 2011, a large Dutch national electricity transmission system operator (TSO) approached our consultancy firm for strategic advice and support on the realization of a national fiber network (this was a year before I became involved, so much of the hard, smart work was already done by the time I came around). The TSO was looking to implement failover systems that could switch electrical circuits within 1/200th of a second (5 ms), so that in case of a circuit loss, power could be rerouted without customers noticing. As some of this latency budget is already eaten up by the switching equipment itself, there’s about 3 ms left for communication. With latency requirements like this, fiber is really the only option.
My firm’s first job was to find out what it would cost to build or lease such a network. Like most carrier-grade networks, this one would have to be built using a hierarchical ring topology. Rings are used to ensure that in case one fiber breaks, signals can still reach all connected nodes by simply traveling the ring the other way around (N+1 redundancy). By keeping the rings under a certain size, the latency requirement of 3ms could be met (yes, speed of light is the limiting factor here). Note that the individual links in a ring are not allowed to physically cross paths: breaking fibers at the crossing point would incapacitate two links at the same time, and is therefore a single point of failure (SPoF).
The team came to the conclusion that leasing dark fiber from multiple, local providers could be interesting compared to leasing from a single nationwide provider: it would provide the right incentives to the providers and allow for substantial savings.
In the Netherlands, there are about 2–3 providers in each area that can provide a fiber link. An average network ring consists of 25 nodes, or 24 links. If each provider is allowed to offer two alternatives, this means there are 2^48, or 281,474,976,710,656 combinations. Checking for SPoFs for all these combinations is infeasible: even if checking a single combination takes only 1ms, we are looking at 8919 years of processing time. Even if we did this in parallel on 8919 cores, it would still take a year.
Our client was asking us to choose the best of 281,474,976,710,656 possible combinations of fiber offerings. We designed an algorithm that could do it in minutes.
So how did they do it? First, a list of conflicts was calculated: all pairs of fiber lines that would never be allowed to be in the ring simultaneously, as they would create an SPoF. We then developed an algorithm that more or less worked as follows:
- Find the cheapest candidate combination of links from a queue. The queue is seeded with a single combination that consists of the cheapest link for each segment in the ring.
- For this candidate, check each link for conflict with any other link in the ring. If there is a conflict: create two alternative candidates. One by replacing the first link, the other by replacing the second link. Add the combination to a queue.
- Check if the candidate is not too long. If it is: enqueue candidates generated by replacing each link with an alternative (if available)
- If both conditions are met, we have found the cheapest candidate. If not, continue with the next-cheapest in the queue. If the queue is empty, then sadly there is no candidate that can meet the criteria.
Note that even this algorithm can still run for a long time if there are a lot of conflicts. There is no guarantee that this algorithm will not have to exhaust all possible options before it arrives at a solution. It is, however, highly unlikely. We did add some tricks to dispose of options early on in the process (e.g. deleting offers that, in combination with the shortest offer on each other segment would still lead to a ring that is too long, and deleting those offers that conflict with all offers on a single other segment in the ring).