Rogue Scholar Posts

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Last week Google and friends released the new major version of their OCR system: Tesseract 4. This release builds upon 2+ years of hard work and has completely overhauled the internal OCR engine. From the tesseract wiki: We have now also updated the R package tesseract to ship with the new Tesseract 4 on macOS and Windows. It uses the new engine by default, and the results are extremely impressive!

Published in rOpenSci - open tools for open science

rOpenSci’s software engineer / postdoc Jeroen Ooms will explain what images are, under the hood, and showcase several rOpenSci packages that form a modern toolkit for working with images in R, including opencv, av, tesseract, magick and pdftools. 🕘 Thursday, November 15, 2018, 10-11AM PST; 7-8PM CET (find your timezone) ☎️ Find all details for joining the call on our Community Calls page. Everyone is welcome. No RSVP needed.

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Earlier this month we released a new version of the tesseract package to CRAN. This package provides R bindings to Google’s open source optical character recognition (OCR) engine Tesseract. Two major new features are support for hOCR and support for the upcoming Tesseract 4.

hOCR output

Support for hOCR output was requested by one of our users on GitHub.

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Last week we released an update of the tesseract package to CRAN. This package provides R bindings to Google’s OCR library Tesseract.

install.packages("tesseract")

The new version ships with the latest libtesseract 3.05.01 on Windows and macOS. Furthermore, it includes enhancements for managing language data and using tesseract together with the magick package.

Published in iPhylo

Some quick notes on possibilities for text-mining BHL (in rough order of priority). Any text-mining would have to be robust to OCR errors. I've created a group of OCR-related papers on Mendeley:

Published in iPhylo

While exploring ways to visually compare classifications I came across the Australian snake name Demansia atra, and ended up reading a series of papers in the Bulletin of Zoological Nomenclature discussing the status of the name (more fun than it sounds, trust me). For example, Smith and Wallach Case 2920.