OCR Tesseract Jobs

1 job was found based on your criteria

  • Hourly – Less than 1 week – Less than 10 hrs/week
    Roughly 300 PDFs, totaling about 3,000 pages, need to be converted to UTF-8 text. Some of them contain image segments and table segments. All have headers and footers that are not wanted: only the body text of each page is desired, with its content in its entirety. However, document titles (as on the cover page) are needed. Documents must first be “stitched together” into a single master PDF document, with the output being a master text ...