Optus has provided its first major look in over two years at a digital assistant that can be summoned from inside voice calls, showing it facilitating a business conversation between speakers of different languages in real time.
The telco demonstrated the assistant, which is now called Voice Genie, during a keynote at the Red Hat Summit 2019 in Boston last week, to sizable audience applause.
iTnews first revealed the existence of the assistant, which was then called Optus.ai, back in November 2016.
A product of Optus' Yes Lab, the assistant was designed to allow Optus and third-party developers to build and offer add-on services that could be accessed from inside an active voice call.
The telco went quiet after the assistant’s existence was publicly disclosed, but has clearly continued to develop the technology since.
“There is this perception that phone calls are dead,” Optus principal software engineer Vasily Chekalkin said.
“In fact they're lagging in terms of features - no emojis, no chat history, no multitasking.
“On the other hand our voice is the most natural way of communicating. Many organisations are working on improving their products using voice - voice assistants to answer your questions, voice biometrics to integrate you with your bank.”
“A little while ago at Optus we asked ourselves can we bring modern voice technologies to the phone call,” Optus senior innovation manager Guillaume Poulet-Mathis said.
“We have capabilities to establish and carry native phone calls, we have towers, data centres and fibre channels, but our mindsets hadn’t changed. Phone calls have remained wires and switches.
“Today we’d like to show you a step change in how we see the phone call.”
Chekalkin and Poulet-Mathis were then shown conducting a regular cellular voice call with each other.
However, as the main part of the conversation began, Chekalkin engaged “Voice Genie” to “start taking notes”, transcribing the conversation in real-time.
He said the capability could be integrated into “email, calendar, contacts” - and presumably any other enterprise system that can connect to the Voice Genie API.
The telco then showed Voice Genie could go a step further by live translating what one person was saying on the call into another language.
Because the conference was held on the US East Coast, the demonstration was pre-recorded, so it is not clear how accurately the live translation functionality performs inside a call under real-world conditions.
Voice as a platform
In an interview with tech broadcast service TheCube, Poulet-Mathis likened what Optus is doing to the app development ecosystem, where third parties are able to build software products to run on top of smartphone platforms.
In a similar way, Optus appears hopeful of creating an ecosystem of services by setting up the voice call as a kind of “open” platform.
“If our network becomes more open we have further opportunities to leverage this network to try to build new products, and there are plenty of products that we are also working on that are based on this idea that we can build products like people build apps,” Poulet-Mathis said.
“If you think of communication, language can be a barrier. The idea we digitise the phone call and then engineer products [to sit inside the call] is very exciting.”
In the case of translation and transcription, Poulet-Mathis said that having the services sit natively in the call was the key differentiator.
“Translation is something a Microsoft and many other cloud companies do really well,” he said.
“The value we add is to move from having an ad hoc translation request - can you translate this - to integrating this in one of the most natural communication channels which is person to person. The phone call is a perfect place to start.”
The capability was demonstrated on the world stage at the Red Hat Summit rather than in Australia, and the connection appears to be that Red Hat’s OpenShift is a key technology enabler.
Red Hat confirmed this, saying that “as part of a voice innovation program, Optus recently used Red Hat OpenShift container platform to deploy a new generation of virtualised mobile core functions.”
Exactly how Optus uses OpenShift in its architecture was not disclosed.
“From a technology point of view, the telephony stack is a complex thing and if you want to integrate directly with telephony systems and phone calls it’s challenging,” Chekalkin said.
“As software developers, what we tend to do when we get some complex things to solve is we abstract it away.
“For us it was the obvious solution - we needed to abstract away all this complexity and provide a very simple way of getting additional voice services within a phone call.”
Chekalkin said there were some additional technical challenges in embedding services inside voice calls, notably getting them to work over Australia’s vast geographical distances and avoiding processing in the cloud, since both long distances and cloud round trips could introduce latency that soured the user experience.
“To avoid additional latency for the call we must be able to deploy our virtualised media functions on the same path as the call,” Poulet-Mathis said.