GitHub - Goldziher/kreuzberg: Document intelligence framework for Python - Extract text, metadata...

GitHub Daily Trend - A podcast by VoiceFeed - Wednesdays

https://github.com/Goldziher/kreuzberg

Document intelligence framework for Python. Extract text, metadata, and structured data from PDFs, images, Office documents, and more. Built on Pandoc, PDFium, and Tesseract.