Publicados in rOpenSci - open tools for open science
Autor Jeroen Ooms

Last week Google and friends released the new major version of their OCR system: Tesseract 4. This release builds upon 2+ years of hard work and has completely overhauled the internal OCR engine. From the tesseract wiki: We have now also updated the R package tesseract to ship with the new Tesseract 4 on MacOS and Windows. It uses the new engine by default, and the results are extremely impressive!