
More About OCR

Optical Character Recognition (OCR) software converts scanned images of printed or typewritten pages into searchable, editable text. We use a variety of software tools and have had good results with uncorrected OCR, both for clients and for the Stones Directories, which we produce for sale.

Our standard OCR services are fully automated: software analyses the digitised images and identifies the text within them, with no need for operator intervention.

NZMS also offers customised OCR services for more challenging material. These include manual zoning of newspaper and journal text (where articles and headlines are not uniformly placed on the page), isolating marginalia, and extracting abstract information from material. We also offer part-OCR, where only specific parts of the material are converted, which avoids the cost of converting content that does not aid discovery. This is particularly useful for journals and magazines, where advertorial content set in artistic fonts can confuse the output. That said, our OCR systems also allow us to "pattern train", whereby we "teach" the system to recognise text in varying fonts in order to improve the accuracy of our conversion.

Not all documents are suited to OCR, however. The accuracy of OCR on handwritten text is very poor, and the output is often unusable. For documents that are not suited to OCR we offer transcription services (also known as ‘data entry’ or ‘keyboarding’), in which the same information is typed independently by more than one person: either two typists with a third person acting as arbiter, or a typist and a verifier. The resulting versions are then compared, and any discrepancies are highlighted and rectified.
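
As a rough illustration of the comparison step in double keying, the sketch below lines up two independently typed versions of the same passage word by word and lists where they differ, so that an arbiter or verifier can check those places against the original. It is a minimal example using Python's standard `difflib` module and is not a description of the tools NZMS uses; the function name `find_discrepancies` and the sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def find_discrepancies(first_keying: str, second_keying: str):
    """Return (word_position, first_version, second_version) for each mismatch.

    Hypothetical helper: compares two typists' transcriptions word by word.
    """
    words_a = first_keying.split()
    words_b = second_keying.split()
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)

    discrepancies = []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag != "equal":
            # Record where the two keyings disagree and what each typist typed.
            discrepancies.append(
                (
                    a_start,
                    " ".join(words_a[a_start:a_end]),
                    " ".join(words_b[b_start:b_end]),
                )
            )
    return discrepancies


# Illustrative sample text only, not taken from any real document.
typist_a = "Received of Mr Smith the sum of ten pounds"
typist_b = "Received of Mr Smyth the sum of ten pounds"

for position, version_a, version_b in find_discrepancies(typist_a, typist_b):
    print(f"word {position}: '{version_a}' vs '{version_b}'")
```

A comparison like this only flags the disagreement. Deciding which version is correct still means checking the original page, which is the role of the arbiter or verifier.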