What is OCR?

Optical Character Recognition for all your documents

OCR (optical character recognition) is a method of extracting letters and words from images or image-based PDFs. Many PDFs are generated without selectable text. That means you cannot simply highlight the text in the PDF and copy and paste it, just as you could not highlight text in a photo you took with your camera. It also means that most crawlers cannot read this image-based text.

CityFind Site Search incorporates automatic OCR text conversion as it crawls your website pages. If it finds a document (such as a PDF) or an image, it performs OCR to extract the text and adds that text to the search index. This means that even large libraries of large image-based documents can be crawled and indexed, and their text becomes searchable using the CityFind Search Engine.
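
For readers curious how a crawl-time OCR step fits into indexing, here is a minimal Python sketch of the general idea. It is not CityFind's actual implementation. The names `extract_text`, `build_index`, and `OCR_TYPES` are hypothetical, and the `ocr` argument stands in for whatever OCR engine a real crawler would call.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Assumption: content types that are routed through OCR in this sketch.
OCR_TYPES = {"application/pdf", "image/png", "image/jpeg", "image/tiff"}


def extract_text(content_type: str, body: bytes,
                 ocr: Callable[[bytes], str]) -> str:
    """Return indexable text for one crawled resource."""
    if content_type in OCR_TYPES:
        # Image-based content has no selectable text, so ask the OCR engine.
        return ocr(body)
    # Everything else is treated as plain text in this simplified sketch.
    return body.decode("utf-8", errors="replace")


def build_index(pages: Iterable[Tuple[str, str, bytes]],
                ocr: Callable[[bytes], str]) -> Dict[str, Set[str]]:
    """Map each lowercased word to the set of URLs whose text contains it."""
    index: Dict[str, Set[str]] = {}
    for url, content_type, body in pages:
        for word in extract_text(content_type, body, ocr).lower().split():
            index.setdefault(word, set()).add(url)
    return index
```

With a stand-in OCR function that returns fixed text, a scanned PDF and a plain-text page end up in the same index, so a search for a word finds both.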

