Google re-releases open source OCR software

By Matt Chapman

Sep 6 2006 11:59AM

Tesseract code unearthed from the HP crypt.

Google has re-released an open source version of optical character recognition (OCR) software originally produced by HP.

The Tesseract program was developed by HP between 1985 and 1995, and in its final year it ranked among the top three OCR packages in a competition organised by the University of Nevada, Las Vegas (UNLV).

Google said in a statement that, although some people might wonder why the search giant was interested in OCR technology, it fitted in with the company's plans to make information available online.

"We are all about making information available to users, and when this information is in a paper document, OCR is the process by which we can convert the pages of this document into text that can then be used for indexing," said Eric Case on the official Google Code blog.
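As a rough illustration of the indexing step Case describes, the sketch below builds a simple inverted index from text that has already been recognised. It does not call Tesseract or reflect how Google indexes documents; the `build_index` function and the `ocr_pages` sample data are hypothetical names invented for this example.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the sorted page numbers that contain it."""
    index = defaultdict(set)
    for page_no, text in pages.items():
        # Split the recognised text into simple alphanumeric tokens.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_no)
    return {word: sorted(page_nos) for word, page_nos in index.items()}


# Hypothetical OCR output (assumed, not real data): page number -> text.
ocr_pages = {
    1: "Tesseract was developed by HP",
    2: "HP released the code to UNLV",
}

index = build_index(ocr_pages)
print(index["hp"])         # pages on which "hp" appears
print(index["tesseract"])  # pages on which "tesseract" appears
```

Once the text of a paper page exists in this form, a search for a word becomes a dictionary lookup rather than a scan of page images.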

HP stopped working on Tesseract in 1995 and released the code to the Information Science Research Institute at UNLV a couple of years ago so that it could be developed for open source.

"UNLV was happy to oblige, but they asked for our help in fixing a few bugs that had crept in since 1995 (ever heard of bit rot?)," wrote Case.

"We tracked down the most obvious ones and decided a couple of months ago that Tesseract OCR was stable enough to be re-released as open source."

Google originally chose to keep the launch low-profile but today's announcement includes an advert for engineers to work on the project.

The software currently supports only English, does not include a page layout analysis module, struggles with greyscale and colour documents, and will not match the accuracy of the best commercial OCR packages currently available.

"Yet, as far as we know, despite its shortcomings, Tesseract is far more accurate than any other open source OCR package out there," wrote Case.
