Telstra is about to switch on an open source SDN controller it has built for its programmable network.
The controller, called OpenKilda, has been published to GitHub, and Telstra has invited outside contributions and encouraged other companies to modify it for their own networks.
The programmable network is a software-defined network asset that Telstra acquired through its purchase of Pacnet. It consists of 35 points of presence in 11 countries.
“What it is is a SDN you can control from a GUI that we operate,” software architect Jeff Young told last month’s Apricot 2018 conference.
“So you can go out and grab bandwidth between Los Angeles and Hong Kong, you can run NFVs [network function virtualisation] on either side and then connect into exchanges, AWS, Azure etc.”
The programmable network uses the OpenFlow protocol to manage and direct traffic flows within the network. Flows are managed by a controller, which runs out of Hong Kong.
“Everything is coming back to that controller to make changes in the network,” Young said.
“Right now, our current controller is managing greater than one million flows.”
But the direction Telstra wanted to take the programmable network in meant added network complexity, and that led to a review of the central controller.
Telstra initially turned to the market. While it found plenty of commercial and open source controllers available, they fell short of its needs.
“If I look at some of those other controllers - and I won’t mention which ones - we did experiment with them, and we did try to actually implement one,” Young said.
“It didn’t work as well as we thought so we were back to the commercial controller that we have, and we kicked off a platform to go and create this new OpenKilda [instead].”
Telstra Networks principal consultant Craig Mulhearn said late last year that “many aspects” of the telco’s activity around the programmable network were “driven by necessity”.
“OpenKilda, for example, we created to deal with the unique global footprint of [the programmable network] (with high latency control plane paths) against feature set requirements that include, for example, auto path reroute based on real time latency/packet loss/jitter measurements,” Mulhearn wrote.
“Put simply, although there are plenty of SDN controllers in the market today, including open source versions, to deliver the features our customers wanted, across the geography we wanted to offer them (ie the world), we needed to DIY.”
Young said the telco wanted a controller that could handle sub-second failover, auto re-route based on real-time latency/packet loss/jitter measurements (which it had in its existing controller), self-healing and optimisation, as well as zero touch deployment and upgrade.
“We wanted it to scale certainly horizontally - [across a] number of switches, number of flows - [because] everyone wants to make the network as big as they possibly can,” he said.
Whatever Telstra built would also ideally support more complex actions applied to packets and flows within the network, as well as the telco’s existing network statistics collection.
“Stats collection is a huge thing for us,” Young said.
“We have a big OpenTSDB [time series database running on Hadoop] so anything we did with a controller would certainly have to continue with that.”
The new controller would also need to be able to measure end-to-end latency on every flow in the network, so Telstra could stay within customer service level agreements (SLAs).
“If we get outside of an SLA we want to be able to go back and offer - or at least move - a customer back within the SLA, and we want latency measurements on every flow that we provide,” Young said.
“So those were our starting requirements for OpenKilda.”
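The SLA-driven re-route requirement described above can be illustrated with a short sketch. This is not OpenKilda code: the `PathStats` structure, the function names and the SLA thresholds are all hypothetical, and jitter is approximated here as the mean absolute difference between consecutive latency samples, which is only one of several common definitions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class PathStats:
    """Recent measurements for one candidate path (hypothetical structure)."""
    name: str
    latency_ms: List[float]  # recent latency samples, oldest first
    loss_pct: float          # packet loss over the same window


def jitter_ms(samples: List[float]) -> float:
    """Mean absolute difference between consecutive latency samples."""
    if len(samples) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(samples, samples[1:]))


def within_sla(path: PathStats, max_latency_ms: float,
               max_loss_pct: float, max_jitter_ms: float) -> bool:
    """True if the path meets all three SLA limits."""
    return (mean(path.latency_ms) <= max_latency_ms
            and path.loss_pct <= max_loss_pct
            and jitter_ms(path.latency_ms) <= max_jitter_ms)


def choose_path(current: PathStats, alternatives: List[PathStats],
                **sla: float) -> PathStats:
    """Keep the current path while it meets the SLA; otherwise move the
    flow to the lowest-latency alternative that does meet it."""
    if within_sla(current, **sla):
        return current
    candidates = [p for p in alternatives if within_sla(p, **sla)]
    if not candidates:
        return current  # nothing better available, so stay put
    return min(candidates, key=lambda p: mean(p.latency_ms))


# Made-up measurements for illustration only.
sla = dict(max_latency_ms=200.0, max_loss_pct=0.5, max_jitter_ms=10.0)
direct = PathStats("direct", [190, 240, 185, 260], 0.1)
via_a = PathStats("via-a", [170, 172, 171, 173], 0.0)
via_b = PathStats("via-b", [150, 210, 150, 210], 0.0)
chosen = choose_path(direct, [via_a, via_b], **sla)
print(chosen.name)
```

In this example the direct path averages more than 200ms, so the flow moves. The second alternative has a lower peak latency but fails the jitter limit, which is why the steadier path is selected.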
Unpicking the architecture
OpenKilda is built around the Kafka message bus and the Storm stream processing engine.
“So messages come in from the switches, enter the Kafka bus, and then we can act on them and do different things with them with Storm,” Young said.
“We can write our own spouts and bolts [in Storm] that process these messages way out in the hinterlands and take actions on them rather than having to run them all back to Hong Kong and have the controller act on them there.
“The controller is still going to get most of them but there are a lot of messages that the controller probably doesn’t even need to see.”
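The idea Young describes, handling some switch messages close to where they originate and sending only the rest to the central controller, can be sketched in plain Python. The sketch below deliberately uses ordinary dictionaries and a queue rather than real Kafka topics or Storm spouts and bolts, and the message types and handler names are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict


def route_message(msg: dict,
                  local_handlers: Dict[str, Callable[[dict], None]],
                  controller_queue: Deque[dict]) -> str:
    """Process a message locally if a handler exists for its type;
    otherwise forward it to the central controller's queue."""
    handler = local_handlers.get(msg["type"])
    if handler is not None:
        handler(msg)
        return "local"
    controller_queue.append(msg)
    return "controller"


# Hypothetical setup: port statistics are absorbed at the edge,
# everything else is sent on to the controller.
collected_stats = []
handlers = {"port_stats": collected_stats.append}
to_controller: Deque[dict] = deque()

messages = [
    {"type": "port_stats", "switch": "sw1", "rx_bytes": 1024},
    {"type": "link_down", "switch": "sw1", "port": 3},
    {"type": "port_stats", "switch": "sw2", "rx_bytes": 2048},
]
decisions = [route_message(m, handlers, to_controller) for m in messages]
print(decisions)
```

Only the link-down event reaches the controller queue in this run; the two statistics messages are dealt with locally.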
OpenKilda also uses the Neo4j graph database management system as well as the existing OpenTSDB to collect network statistics.
“We keep all stats from day one,” Young said. “If you want to go back and look at what you were running a year ago, we’ve got it.”
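The article does not say what OpenKilda stores in Neo4j. One common use of a graph store in a network controller is to hold the topology for path computation, and the sketch below assumes that use. It runs Dijkstra's algorithm over an in-memory list of links instead of querying a database, and the site codes and latency figures are invented for illustration.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Link = Tuple[str, str, float]  # (site_a, site_b, latency in ms)


def lowest_latency_path(links: List[Link], src: str,
                        dst: str) -> Optional[Tuple[float, List[str]]]:
    """Dijkstra's algorithm over an undirected graph of links.
    Returns (total latency, list of sites) or None if unreachable."""
    graph: Dict[str, List[Tuple[str, float]]] = {}
    for a, b, ms in links:
        graph.setdefault(a, []).append((b, ms))
        graph.setdefault(b, []).append((a, ms))

    best = {src: 0.0}          # lowest known latency to each site
    prev: Dict[str, str] = {}  # predecessor on the best known path
    heap = [(0.0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, ms in graph.get(node, []):
            new_cost = cost + ms
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))

    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return best[dst], path[::-1]


# Invented topology: the two-hop route is faster than the direct link.
links = [
    ("LAX", "HKG", 160.0),
    ("LAX", "TYO", 100.0),
    ("TYO", "HKG", 50.0),
    ("LAX", "SYD", 140.0),
    ("SYD", "HKG", 120.0),
]
result = lowest_latency_path(links, "LAX", "HKG")
print(result)
```

A real controller would refresh the link weights from live measurements, so the preferred path could change as conditions do.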
Invites outside participation
Young said OpenKilda had achieved most of its original objectives, though there were a couple - such as sub-second failover - that were still being worked on.
With the controller available on GitHub, he was keen for feedback as well as external assistance “to improve on what we’ve done”.
Young noted that most of his - and his team’s - time was spent on adding products and features to the programmable network itself.
“This is what takes up all of our daytime, and whatever else we have is our volunteer action to create OpenKilda,” he said.
“That’s why I’m out here seeing if I can’t drum up some support because really what I’m after is your time. We spend our spare time this way, maybe you can too.
“Frankly once you release something like this we don’t know what will happen with it but this is what we’d like to see.”