Rogue Scholar Posts

Published in quantixed

Hot on the heels of the post on how to downsize microscopy movie files, let’s look at ways to shrink the size of a PDF file. There are several ways to tackle this; suggestions came from this thread on Mastodon. Scenario: you have created a preprint/manuscript/proposal in PDF format.

Published in Stories by Research Graph on Medium
Author Aland Astudillo (ORCID: 0009-0008-8672-3168)

How to use GROBID to extract text from PDF

GROBID is a powerful machine-learning tool that extracts text from PDF files and other documents into a structured format. One of the key challenges in knowledge mining from academic articles is reading the content of PDF files.

Published in tarleb
Author Albert Krewinkel

Typst, the new writing tool, was open-sourced a couple of days ago. This is right up my alley of course, and I have a couple of thoughts on it, which I share here. What is it? Typst is a writing tool that’s described as a LaTeX alternative: it takes plain-text markup as input and can produce nice-looking PDFs from that.

Published in tarleb
Author Albert Krewinkel

Setting the document font this way works for ConTeXt, LaTeX, and HTML output. The fonts used in docx or odt output must be controlled with the reference document instead. The default LaTeX engine is pdflatex, which only supports TeX’s own font format and cannot use the TrueType or OpenType fonts installed on the system. However, XeLaTeX was written with that in mind; switching to that engine allows you to specify any font available on the system.

Published in tarleb
Author Albert Krewinkel

A question came up on the Lua mailing list, asking whether there was a PDF version of the Lua manual. This is, of course, the home domain of pandoc, and I got nerd-sniped into producing a PDF (and ePUB) version of the manual. This is a good opportunity to showcase some pandoc features. The post describes the process of going from an HTML web page to a PDF file via LaTeX and pandoc. We will see how to quickly convert documents with pandoc;

Published in rOpenSci - open tools for open science
Author Jeroen Ooms

Last month we released a new version of pdftools and a new companion package qpdf for working with pdf files in R. This release introduces the ability to perform pdf transformations, such as splitting and combining pages from multiple files. Moreover, the pdf_data() function which was introduced in pdftools 2.0 is now available on all major systems.

Split and Join PDF files

It is now possible to split, join, and compress pdf files with pdftools.