Human understanding drives speech recognition research

By Ry Crozier

Nov 24 2008 2:28PM

Speech recognition systems are increasingly about understanding natural language and predicting probable reasons for a call, presenting challenges for researchers in the space, according to VeCommerce.

The Australian firm, which first started developing speech recognition systems in the 1990s, is once again claiming to be pushing the envelope by developing systems that can understand conversations – not just recognise individual pieces of speech.

“Speech recognition is nearly as accurate as the human ear in determining what gets said,” managing director of VeCommerce, Paul Magee, told iTnews.

Getting computer systems to actually understand what is said is challenging.

The reason is that the human ear typically picks up only a certain percentage of a conversation. The brain then uses stored ‘context’ to fill in the missing words and assemble a complete understanding of what has been said.

The more unfamiliar a subject is, the greater the need to listen with complete accuracy, because there is less context, or in some cases none, to fill in any gaps, according to Magee.

Transferring that ability into speech recognition algorithms and systems represents the next logical evolution for the technology.

Magee uses a horse-racing gambling system as an example.

“The limitations of core speech recognition systems means it can be very hard to ‘hear’ the difference between ‘seven’ and ‘eleven’,” said Magee.

“In a gambling system, did the person calling in mean race seven or eleven? The challenge is how can the system make a sensible decision without asking the person what they said?”

Magee said newer systems were able to use ‘fuzzy logic’ to apply probabilities to all the potential options to work out the most likely one and gain the degree of certainty needed to make a correct judgement.

For example, if race seven has already been run, the caller is probably phoning about race 11. The system might also check whether there is a horse numbered seven in race 11.

“The system can infer certain things from what the user is saying,” said Magee.
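The seven-versus-eleven example can be illustrated with a short sketch. It is not VeCommerce's implementation, and it uses simple probability weighting rather than formal fuzzy logic: the recogniser's confidence in each candidate is multiplied by a weight derived from context, and the results are normalised. The function names, the confidence figures and the 0.05 penalty for a race that has already been run are all assumptions made for illustration.

```python
def rank_candidates(acoustic_scores, context_weights):
    """Combine recogniser confidence with context weights and normalise to sum to 1."""
    if sum(acoustic_scores.values()) <= 0:
        raise ValueError("at least one candidate needs a positive acoustic score")
    combined = {
        candidate: score * context_weights.get(candidate, 1.0)
        for candidate, score in acoustic_scores.items()
    }
    total = sum(combined.values())
    if total == 0:
        # Context ruled out every candidate, so fall back to acoustics alone.
        combined = dict(acoustic_scores)
        total = sum(combined.values())
    return {candidate: value / total for candidate, value in combined.items()}


def race_context(candidates, races_already_run, penalty=0.05):
    """Hypothetical context rule: a race that has already been run is an unlikely bet."""
    return {
        race: (penalty if race in races_already_run else 1.0)
        for race in candidates
    }


# Assumed recogniser output: it slightly prefers "seven" on sound alone.
acoustic = {7: 0.55, 11: 0.45}

# Context: race seven has already been run.
weights = race_context(acoustic, races_already_run={7})
posterior = rank_candidates(acoustic, weights)
best = max(posterior, key=posterior.get)

print(best, round(posterior[best], 2))
```

With these assumed numbers, context outweighs the recogniser's slight preference for "seven" and race 11 becomes the most likely interpretation. A real system could compare the winning score against a threshold and ask the caller to confirm only when it falls short.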

The other major trend in the industry is towards biometric voice verification, according to Magee.

“Every one of us has a unique voice pattern,” he explained.

“It’s more accurate than a fingerprint in determining individual identity, and people aren’t scared about using their own voice [for verification].

“Iris or fingerprint scanners have hugely negative connotations when it comes to privacy. Voice is a non-invasive way to establish identity and people are voting with their feet,” said Magee.
