$ pdftotext -layout input.pdf output.txt
Preinstalled in current versions of Debian, Ubuntu, et al.; Homebrew formula (brew install poppler), raw source, and Windows binary also available. Beautiful conversion of QuickBooks invoice PDFs into plain text.
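For a whole folder of invoices, the same command can be wrapped in a loop. This is only a minimal sketch: the function name pdf_dir_to_text is made up for illustration, and it assumes pdftotext is already on your PATH and that writing each .txt file next to its PDF is what you want.

```shell
# Hypothetical helper (not part of poppler): convert every PDF in a
# directory to a text file with the same base name.
# Usage: pdf_dir_to_text DIR   (DIR defaults to the current directory)
pdf_dir_to_text() {
    dir=${1:-.}
    for pdf in "$dir"/*.pdf; do
        # If the directory has no PDFs the glob stays unexpanded; skip it.
        [ -e "$pdf" ] || continue
        # -layout asks pdftotext to keep the page's physical layout,
        # which is what keeps invoice columns lined up.
        pdftotext -layout "$pdf" "${pdf%.pdf}.txt"
    done
}
```

Calling pdf_dir_to_text ~/invoices would then leave an invoice.txt beside each invoice.pdf. Quoting "$pdf" keeps file names with spaces intact.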
H/T: Linux Uprising
UPDATE: "Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, more accurate on most documents, and has low hallucination risk."
/nix | Apr 14, 2023