Rogue Scholar

Updates to the rOpenSci image suite: magick, tesseract, and av

Published 27 September 2019 in rOpenSci - open tools for open science

Author Jeroen Ooms

Image processing is one of the core focus areas of rOpenSci. Over the last few months we have released several major upgrades to core packages in our imaging suite, including magick, tesseract, and av. This post highlights a few cool new features.

Packages, Tesseract, Images, OCR, Tech Notes, Computer and Information Sciences, English

Tesseract 4 is here! State of the art OCR in R!

https://doi.org/10.59350/q9b7v-e9851

Published 6 November 2018 in rOpenSci - open tools for open science

Author Jeroen Ooms

Last week Google and friends released the new major version of their OCR system: Tesseract 4. This release builds upon 2+ years of hard work and has completely overhauled the internal OCR engine. From the tesseract wiki: […] We have now also updated the R package tesseract to ship with the new Tesseract 4 on macOS and Windows. It uses the new engine by default, and the results are extremely impressive!

Magick, Tesseract, Cld2, Cld3, Taxize, Computer and Information Sciences, English

What's this bird? Classify old natural history drawings with R

https://doi.org/10.59350/p7699-dd726

Published 28 August 2018 in rOpenSci - open tools for open science

Author Maëlle Salmon

In this new post, we’re taking a break from modern birding data in our birder’s series… let’s explore gorgeous drawings from a natural history collection!

Packages, Tesseract, Images, OCR, Tech Notes, Computer and Information Sciences, English

Support for hOCR and Tesseract 4 in R

https://doi.org/10.59350/37079-8yg05

Published 14 February 2018 in rOpenSci - open tools for open science

Author Jeroen Ooms

Earlier this month we released a new version of the tesseract package to CRAN. This package provides R bindings to Google’s open source optical character recognition (OCR) engine Tesseract. Two major new features are support for hOCR and support for the upcoming Tesseract 4. hOCR output: Support for hOCR output was requested by one of our users on GitHub.
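As a rough sketch of what requesting hOCR output might look like, the snippet below assumes the `HOCR` argument of `tesseract::ocr()`. The helper name `ocr_as_hocr` and the file name in the usage comment are hypothetical, and the helper stops early if the package is not installed.

```r
# Hypothetical helper: return OCR results as hOCR markup instead of plain text.
ocr_as_hocr <- function(image_path, language = "eng") {
  if (!requireNamespace("tesseract", quietly = TRUE)) {
    stop("Package 'tesseract' is required; try install.packages('tesseract')")
  }
  # Build an engine for the requested language (English is the default)
  engine <- tesseract::tesseract(language)
  # HOCR = TRUE asks for hOCR markup, which includes word bounding boxes
  tesseract::ocr(image_path, engine = engine, HOCR = TRUE)
}

# Usage (assumes a local file named 'scan.png' exists):
# markup <- ocr_as_hocr("scan.png")
# cat(substr(markup, 1, 300))
```

The returned string can then be handed to an XML or HTML parser to pull out word positions, which plain-text output does not carry.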

Packages, Tesseract, OCR, Tech Notes, Computer and Information Sciences, English

Tesseract and Magick: High Quality OCR in R

https://doi.org/10.59350/v2b9z-07s70

Published 17 August 2017 in rOpenSci - open tools for open science

Author Jeroen Ooms

Last week we released an update of the tesseract package to CRAN. This package provides R bindings to Google’s OCR library Tesseract. Install it with install.packages("tesseract"). The new version ships with the latest libtesseract 3.05.01 on Windows and macOS. Furthermore, it includes enhancements for managing language data and using tesseract together with the magick package.
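One way tesseract and magick might be combined is to clean up an image before recognition. The sketch below is an illustration under assumptions, not the post's own code: the helper name `preprocess_and_ocr`, the resize width, and the choice of preprocessing steps are all hypothetical.

```r
# Hypothetical helper: preprocess an image with magick, then OCR it.
preprocess_and_ocr <- function(image_path, language = "eng") {
  for (pkg in c("magick", "tesseract")) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop(sprintf("Package '%s' is required", pkg))
    }
  }
  img <- magick::image_read(image_path)
  img <- magick::image_resize(img, "2000x")             # 2000 px wide (illustrative value)
  img <- magick::image_convert(img, type = "Grayscale") # drop colour information
  img <- magick::image_trim(img)                        # crop uniform borders
  # Write to a temporary PNG so that ocr() receives a plain file path
  tmp <- tempfile(fileext = ".png")
  on.exit(unlink(tmp), add = TRUE)
  magick::image_write(img, path = tmp, format = "png")
  tesseract::ocr(tmp, engine = tesseract::tesseract(language))
}

# Usage (assumes a local file named 'page.jpg' exists):
# cat(preprocess_and_ocr("page.jpg"))
```

Which preprocessing steps actually help depends on the source material, so the pipeline above is best treated as a starting point to tune.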

Packages, Tesseract, Tech Notes, Computer and Information Sciences, English

Tesseract Update: Options and Languages

https://doi.org/10.59350/b5pf8-h3m68

Published 8 December 2016 in rOpenSci - open tools for open science

Author Jeroen Ooms

A few weeks ago we announced the first release of the tesseract package: a high quality OCR engine in R. We have now released an update with extra features. Installing Training Data: As explained in the first post, the tesseract system is powered by language specific training data. By default only English training data is installed. Version 1.3 adds utilities to make it easier to install additional training data.
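A minimal sketch of installing additional training data, assuming the package's `tesseract_info()` and `tesseract_download()` functions; the helper name `ensure_language` is hypothetical, and the download step needs network access.

```r
# Hypothetical helper: make sure training data for `language` is available,
# downloading it only when it is missing, then return an engine for it.
ensure_language <- function(language) {
  if (!requireNamespace("tesseract", quietly = TRUE)) {
    stop("Package 'tesseract' is required; try install.packages('tesseract')")
  }
  # tesseract_info()$available lists the languages already installed
  installed <- tesseract::tesseract_info()$available
  if (!language %in% installed) {
    tesseract::tesseract_download(language) # requires network access
  }
  tesseract::tesseract(language)
}

# Usage (language codes are three-letter identifiers such as "nld" for Dutch):
# dutch <- ensure_language("nld")
# text <- tesseract::ocr("brief.png", engine = dutch)
```

Checking the installed languages first avoids downloading the same data again on every call.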

Packages, Tesseract, Computer and Information Sciences, English

The new Tesseract package: High Quality OCR in R

https://doi.org/10.59350/r1f37-rc724

Published 16 November 2016 in rOpenSci - open tools for open science

Author Jeroen Ooms

Optical character recognition (OCR) is the process of extracting written or typed text from images such as photos and scanned documents into machine-encoded text. The new rOpenSci package tesseract brings one of the best open-source OCR engines to R. This enables researchers or journalists, for example, to search and analyze vast numbers of documents that are only available in printed form.

Rogue Scholar posts

Updates to the rOpenSci image suite: magick, tesseract, and av

Tesseract 4 is here! State of the art OCR in R!

What's this bird? Classify old natural history drawings with R

Support for hOCR and Tesseract 4 in R

Tesseract and Magick: High Quality OCR in R

Tesseract Update: Options and Languages

The new Tesseract package: High Quality OCR in R