The Modern Document Processing Stack
(github.com/marcelmarais)
This is a production-ready document conversion and processing engine (and primarily a wrapper of other tools). It uses open-source libraries to convert common file formats (PDF, DOCX, etc.) and web content to Markdown—a format that is friendly for LLMs and embedding models.
This is a production-ready document conversion and processing engine (and primarily a wrapper of other tools). It uses open-source libraries to convert common file formats (PDF, DOCX, etc.) and web content to Markdown—a format that is friendly for LLMs and embedding models.