Telstra directs automation at triaging a 5G misconfiguration


On the path to developing autonomous networks for 2030.

Telstra is fine-tuning the technology stack it uses for network automation under its Connected Future 30 strategy by applying it to a common misconfiguration that can disrupt 5G network slicing.

Telstra's Kenny Cheng.

Senior chapter lead Kenny Cheng told a recent Red Hat Ansible Automates event in Sydney that misconfigured slice IDs on 5G wireless base stations could result in between 300 and 400 support tickets being raised a month.

The slice ID directs device traffic to a virtual segment of the mobile network that may be tailored to handle specific needs or requests.

These misconfigurations “can disrupt customer experience as devices move across nodes [base stations] within the same tracking area,” Telstra said in a slide deck.
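To illustrate the kind of inconsistency being described, here is a minimal Python sketch that flags base stations whose configured slice IDs differ from the most common configuration in their tracking area. The data layout, node names and slice ID values are hypothetical, and treating the majority configuration as the correct one is an assumption made for the example; none of it is drawn from Telstra's systems.

```python
from collections import Counter, defaultdict


def find_misaligned_nodes(nodes):
    """Return {node_name: (configured, expected)} for nodes whose slice IDs
    differ from the most common configuration in their tracking area."""
    by_area = defaultdict(list)
    for node in nodes:
        by_area[node["tracking_area"]].append(node)

    misaligned = {}
    for area_nodes in by_area.values():
        # Assumption: the most common slice ID set in the area is the intended one.
        counts = Counter(frozenset(n["slice_ids"]) for n in area_nodes)
        expected = counts.most_common(1)[0][0]
        for n in area_nodes:
            configured = frozenset(n["slice_ids"])
            if configured != expected:
                misaligned[n["name"]] = (sorted(configured), sorted(expected))
    return misaligned


# Hypothetical inventory: three base stations in one tracking area.
nodes = [
    {"name": "gnb-a", "tracking_area": "TA1", "slice_ids": ["1-000001", "1-000002"]},
    {"name": "gnb-b", "tracking_area": "TA1", "slice_ids": ["1-000001", "1-000002"]},
    {"name": "gnb-c", "tracking_area": "TA1", "slice_ids": ["1-000001"]},
]
print(find_misaligned_nodes(nodes))
```

In this made-up inventory only gnb-c is flagged, because it is missing a slice ID that its neighbours in the same tracking area carry.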

The automation push focuses on triaging alerts about misconfigured slice IDs: collecting the relevant information, up to and including the change that needs to happen, and forwarding it to the radio access network (RAN) team in a format that its own automation tool can ingest. That tool then schedules the change.

Cheng said that typically an on-call 5G core support team member would pick up an alert paged to them via PagerDuty and then investigate it - “log into the node that generated that alert, look at all the configuration, gather that configuration, figure out what is it that the radio side [of Telstra operations] has to realign, and then probably send it as an email - ‘Hey, in your next change request window could you please realign this configuration?’”

Misconfigured slice IDs occur often enough that the information-gathering exercise might only take an on-call support staffer “around 10 minutes or so”, Cheng said, but this still adds up.

“When you look at the number of alerts across the month, you’re talking about potentially 300-plus,” he said.

“In a manual [triage] process you’ll probably do that in batches and bulk, but on average you’re talking about a 12-24 hours of an individual’s time having to support this.”
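As a rough back-of-envelope check using only the figures quoted above, the short Python sketch below compares the unbatched total (300 alerts at 10 minutes each) with the quoted 12-24 hours. The implied per-alert time under batching is an inference from those numbers, not a figure Telstra provided.

```python
# Figures quoted by Cheng: about 300 alerts a month, about 10 minutes each.
alerts_per_month = 300
minutes_per_alert = 10

# Handling every alert individually, with no batching.
unbatched_hours = alerts_per_month * minutes_per_alert / 60
print(f"Unbatched: {unbatched_hours} hours a month")

# Effective minutes per alert implied by the quoted 12-24 hour range.
effective_minutes = [hours * 60 / alerts_per_month for hours in (12, 24)]
print(f"Implied time per alert with batching: {effective_minutes} minutes")
```

On these numbers, handling alerts one at a time would take about 50 hours a month, so the quoted range suggests batching brings the effective effort down to a few minutes per alert.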

Using the Ansible Automation Platform (AAP), a detected misconfiguration triggers an Ansible rulebook that “simulates exactly what the [support] person would’ve done” to gather information and forward it to the RAN team to action.
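The steps described here (receive the alert, gather the node's configuration, work out what needs realigning, hand a machine-readable request to the RAN team) can be sketched in a few lines of Python. This is an illustration of the logic only, not Telstra's Ansible rulebook: the alert fields, the `fetch_config` callable, the request format and the slice ID values are all hypothetical.

```python
import json


def build_ran_request(alert, fetch_config):
    """Turn a slice ID alert into a realignment request for the RAN team.

    `fetch_config` is a hypothetical callable that returns the current
    configuration of a node as a dict with a "slice_ids" list.
    """
    config = fetch_config(alert["node"])  # step 1: gather configuration
    expected = set(alert["expected_slice_ids"])
    configured = set(config["slice_ids"])
    return {                              # step 2: work out the change
        "node": alert["node"],
        "add_slice_ids": sorted(expected - configured),
        "remove_slice_ids": sorted(configured - expected),
        "source_alert": alert["id"],
    }


def fake_fetch_config(node):
    # Stand-in for logging in to the node and reading its configuration.
    return {"slice_ids": ["1-000001", "1-000003"]}


alert = {
    "id": "ALERT-1",
    "node": "gnb-c",
    "expected_slice_ids": ["1-000001", "1-000002"],
}
request = build_ran_request(alert, fake_fetch_config)

# Step 3: serialise in a form another automation tool could ingest.
print(json.dumps(request, indent=2))
```

Emitting structured data rather than a free-text email is what would let a downstream tool schedule the change without a person re-keying it.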

“What you’re seeing is the introduction of AAP into the original support aspect of it,” Cheng said.

“What resulted … is … zero touch automation, so within a minute you’ve now essentially sorted out that issue as opposed to potentially someone being in your team having to address the alert, logging in, doing that procedure instead.”

Cheng later clarified that with automation running, the on-call engineer would still be alerted to the issue - but is likely to find it’s been addressed prior to them having to do anything.

Alerting on-call support is still useful as a backup, however, in case of an issue with the automated workflow being executed.

“The alert will still happen. It’s just that within a minute, the person might find it, and they see it’s already closed - done,” he said. 

“We still need the alert to happen because what if our workflow didn’t work? The ticket won’t get acknowledged. 

“The original or first process becomes the fallback, as opposed to the main primary support. So the person might still get the alert but when they click into it, it’s already been solved by the workflow/AAP.”
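The fallback arrangement Cheng describes can be sketched as follows: the alert is always raised, and it is only marked as resolved if the automated workflow completes. The status names and function signatures are assumptions made for illustration and do not reflect PagerDuty's or AAP's actual APIs.

```python
def handle_alert(alert, workflow):
    """Run the automated workflow against an alert that has already been paged.

    Returns a copy of the alert with an updated status. If the workflow
    fails, the alert stays open so the on-call engineer picks it up.
    """
    result = dict(alert)
    try:
        workflow(alert)
    except Exception as exc:
        # Automation failed: leave the alert open for the manual process.
        result["status"] = "open"
        result["note"] = f"automation failed: {exc}"
    else:
        # Automation succeeded: the engineer will find it already closed.
        result["status"] = "resolved"
        result["note"] = "closed by automated workflow"
    return result


def working_workflow(alert):
    return None  # stand-in for a successful triage run


def failing_workflow(alert):
    raise RuntimeError("node unreachable")  # stand-in for a failed run


resolved = handle_alert({"id": "A1", "status": "open"}, working_workflow)
fallback = handle_alert({"id": "A2", "status": "open"}, failing_workflow)
print(resolved["status"], fallback["status"])
```

The design point is that automation failure is the default assumption: nothing is closed unless the workflow positively reports success.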

"Too ambitious"

It appears that automating the triaging of slice ID misconfigurations wasn’t first on Telstra’s automation wishlist, but that it proved to be a practical use case for the technology.

Cheng indicated that the team had been “too ambitious” in its initial attempts at using the automation technology.

“We tried to do too much at the start,” he said.

“What was successful was to actually be a bit more practical. What are the small day-to-day things that we could tackle? 

“Get those runs on the board first, it gives you extra time, and then you can start to tackle the more complex ones.”

The path to autonomy

Network automation is a key part of the telco’s Connected Future 30 strategy.

“One of the major objectives is around the Telstra Autonomous Network, and part of the challenge here is where we need to get to level four autonomy by about 2030,” Cheng said.

“We’re going to actually try to get towards that whole goal in the global networks and technology division [of Telstra].”

Level four autonomy was described on a presentation slide as “featuring advanced self-managing capabilities like self-configuration, self-healing and self-optimisation, driven by AI and machine learning”.

“Human involvement is minimal, focusing mainly on strategic oversight rather than operational control.”

Cheng said that his department wanted to hit level two autonomy - and “level three in a lot of situations” - by 2028.

Under level two, systems can “automatically execute specific tasks based on predefined rules, reducing the need for constant human input. However, human operators are still required for complex decisions and to manage exceptions beyond the scope of the automation.”

Meanwhile, level three denotes a “conditional autonomous network” that is able “to make decisions based on real-time context and predefined conditions, enabling more adaptive and intelligent operations.”

“While automation handles most routine scenarios, human oversight is still needed for governance and handling complex or unforeseen situations,” the slide states.

The Red Hat presentation adds significant context and detail about Telstra’s autonomous network efforts and how far the telco has progressed.

Back in March, Telstra said it “demonstrated an AI-enabled self-healing capability, in collaboration with Red Hat, Dell Technologies and Cisco.”

That use case was different, however - “to autonomously detect and resolve an unplanned infrastructure outage by shifting critical network applications to healthy hardware in just minutes compared with hours.”

Cheng suggested that such use cases were on the horizon in his specific domain, as Telstra developed maturity with the supporting technology stack.

“As we start looking forward, we want to be a bit more ambitious - [tackling] hardware failures, things that are going to be more critical,” he said.

“How do we leverage event-driven Ansible and then eventually look at some of the more AI intelligence capability that might also come along to help guide or steer that decision making process as well?”

Copyright © iTnews.com.au . All rights reserved.