Retab secures $3.5M and unveils the market’s most powerful document AI platform

The developer-focused document automation platform emerges from stealth with fresh funding, a new product, and an ambitious vision to power the next wave of vertical AI.

Retab, an AI agent for building document extraction pipelines, has raised $3.5 million in pre-seed funding alongside the launch of its platform.

Retab is a developer platform and SDK that transforms document processing for the era of large language models. Developers simply define the data schema they need, while Retab manages everything else, from dataset labelling and evaluation to automated prompt engineering and model selection.
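To make "defining the data schema" concrete, here is a minimal sketch of what such a definition could look like for an invoice, written as a JSON-Schema-style dictionary with a small checker. The field names, the `INVOICE_SCHEMA` constant, and the `validate` helper are assumptions made for illustration; they are not taken from Retab's SDK.

```python
from typing import Any

# Hypothetical schema: the fields a developer wants extracted from each invoice.
INVOICE_SCHEMA: dict[str, Any] = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}

# Maps the schema's type names to Python types (only the two used above).
_PY_TYPES = {"string": str, "number": (int, float)}


def validate(record: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of problems found in an extracted record (empty if none)."""
    errors = []
    # Every required field must be present.
    for name in schema["required"]:
        if name not in record:
            errors.append(f"missing field: {name}")
    # Every present field must be declared and have the declared type.
    for name, value in record.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"unexpected field: {name}")
        elif not isinstance(value, _PY_TYPES[spec["type"]]):
            errors.append(f"wrong type for {name}")
    return errors


if __name__ == "__main__":
    print(validate({"invoice_number": "INV-001", "total": 120.5}, INVOICE_SCHEMA))
```

In a setup like this, the schema is the only thing the developer writes by hand; anything that produces records can be checked against it.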

The idea for Retab was born from the founders' early work building internal automation tools for document-heavy workflows in logistics. They quickly realised that their true breakthrough wasn’t in the end results, but in the orchestration layer they had created to make AI models work reliably and efficiently. That layer became the core of Retab.

Louis de Benoist, co-founder and CEO of Retab, shared:

"People keep building demos that look like magic, but break the moment you put them into production.

"We lived that pain ourselves. Wiring up fragile pipelines just to extract a few fields from a PDF. We built Retab because it’s the developer-first platform we always wished we had."

Today, Retab’s all-in-one platform is used by dozens of companies to convert messy PDFs, handwritten scans, and other unstructured inputs into clean, structured data, without relying on brittle third-party tools. Users define the data they need and upload their files, and Retab handles the rest, from dataset labelling and extraction logic to evaluation and benchmarking. It routes each task to the best-performing model and switches automatically as newer, more effective models emerge.

Importantly, Retab isn’t just another large language model. It acts as the intelligence layer that makes cutting-edge models from providers like OpenAI, Google, and Anthropic usable for real-world, high-stakes workflows. By managing the full lifecycle of document extraction with verifiable accuracy, Retab enables teams to replace manual processes with fast, accurate, and self-improving workflows across use cases like contracts, invoices, and compliance documents.

The platform delivers guaranteed performance through a system of intelligent checks and balances:

  • Self-Optimising Schemas - An AI agent automatically tests and refines instructions based on a user’s documents, maximising accuracy before the system ever goes live.
  • Intelligent Model Routing - The platform is model-agnostic. It automatically benchmarks and routes each task to the best-performing model for the job, whether the priority is cost, speed, or accuracy. According to the company, this can make processes up to 100x cheaper than other solutions.
  • Guided Reasoning & k-LLM Consensus - Retab forces models to "think" step-by-step and uses a consensus mechanism among multiple models to quantify uncertainty, acting as a powerful safety net to ensure trustworthy results. 
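The consensus idea in the last point can be sketched in a few lines. The version below assumes k extraction results have already been collected, one dictionary per model run, and takes a simple per-field majority vote, reporting the share of runs that agree as a rough confidence signal. The `consensus` function and the sample values are hypothetical; Retab's actual mechanism is not described in detail here and may work differently.

```python
import json
from collections import Counter
from typing import Any


def consensus(extractions: list[dict[str, Any]]) -> dict[str, tuple[Any, float]]:
    """Majority-vote each field across k runs; return (value, agreement ratio)."""
    if not extractions:
        raise ValueError("need at least one extraction")
    k = len(extractions)
    # Collect every field name that appears in any run.
    fields = sorted({name for run in extractions for name in run})
    result = {}
    for name in fields:
        # Serialise values so lists and dicts can be counted too.
        # A field missing from a run counts as a vote for null.
        votes = [json.dumps(run.get(name), sort_keys=True) for run in extractions]
        winner, count = Counter(votes).most_common(1)[0]
        result[name] = (json.loads(winner), count / k)
    return result


if __name__ == "__main__":
    # Illustrative outputs from three hypothetical model runs.
    runs = [
        {"invoice_number": "INV-001", "total": 120.5},
        {"invoice_number": "INV-001", "total": 120.5},
        {"invoice_number": "INV-001", "total": 126.5},
    ]
    for field, (value, agreement) in consensus(runs).items():
        print(field, value, round(agreement, 2))
```

A field with low agreement is a natural candidate for human review, which is the sense in which disagreement between models quantifies uncertainty.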

"Retab is the OS for reliably extracting structured data. It wraps the best models in a layer of logic that actually makes them usable with error handling and structured outputs. That’s what devs need if they want to build production apps, not just prototypes," said de Benoist.

With a lean team of just ten employees and a rapidly growing developer community, Retab is positioning itself as a core layer in the AI infrastructure stack, a tool designed not just to showcase what’s possible but to empower others to build with it.

Customers in logistics, finance, and healthcare are already benefiting from Retab. One major trucking company leveraged the platform to identify the smallest and fastest model configuration that met its 99 per cent accuracy requirement, significantly cutting operational costs. A financial services firm now uses Retab to extract detailed quantitative metrics and qualitative risk insights from 200-page quarterly reports, a task that once took a team of analysts several days. Other users are streamlining processes like claims handling, medical record processing, identity verification, and onboarding, with minimal setup required.
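The trucking example amounts to a constrained selection: among the models that clear an accuracy bar on a labelled evaluation set, pick the cheapest, breaking ties on speed. A minimal sketch follows, assuming the benchmark figures already exist. The `Candidate` class, the `pick_model` function, the model names, and all numbers are placeholders for illustration, not measurements from Retab or any customer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float      # fraction of fields correct on an evaluation set, 0..1
    cost_per_doc: float  # arbitrary currency units
    latency_s: float     # seconds per document


def pick_model(candidates: list[Candidate], min_accuracy: float) -> Optional[Candidate]:
    """Return the cheapest candidate meeting the accuracy bar, or None if none do."""
    eligible = [c for c in candidates if c.accuracy >= min_accuracy]
    if not eligible:
        return None
    # Cheapest first; latency breaks ties between equally priced models.
    return min(eligible, key=lambda c: (c.cost_per_doc, c.latency_s))


if __name__ == "__main__":
    # Placeholder figures, invented purely to exercise the function.
    pool = [
        Candidate("small", accuracy=0.97, cost_per_doc=0.001, latency_s=0.4),
        Candidate("medium", accuracy=0.992, cost_per_doc=0.004, latency_s=0.9),
        Candidate("large", accuracy=0.996, cost_per_doc=0.03, latency_s=2.5),
    ]
    print(pick_model(pool, min_accuracy=0.99))
```

Returning `None` when no candidate qualifies keeps the failure explicit, so a caller can relax the threshold or fall back to manual processing rather than silently using a model that misses the requirement.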

The round was backed by leading early-stage funds including VentureFriends, Kima Ventures, and K5 Global, alongside Eric Schmidt (via StemAI), Olivier Pomel (CEO, Datadog), and Florian Douetteau (CEO, Dataiku). 

Florian Douetteau, co-founder and CEO of Dataiku and an investor in Retab, noted that the broader adoption of AI across the economy relies on the ability to transform document-heavy operations into reliable, structured data that autonomous systems can effectively use:

"On a large scale, this process hinges on quality control, cost efficiency, and rapid implementation. The team at Retab understands this thoroughly and is uniquely positioned to solve it for the thousands of AI-first companies that are emerging."

Looking ahead, Retab is expanding its capabilities beyond documents to apply its reliable extraction methods to websites. It’s also rolling out integrations with automation platforms like n8n, Zapier, and Dify to streamline workflows even further.

At the heart of Retab’s long-term vision is its goal to become the intelligent middleware layer between the world’s unstructured data and the AI agents that need to interpret it. Whether processing a loan file, a contract, or a customs manifest, Retab turns unstructured content into usable, safe, and programmable data.

The newly raised capital will fuel continued platform development and community expansion, enabling the company to scale its infrastructure to meet growing demand from vertical AI startups and internal innovation teams.

Lead image: Retab team | Photo: Uncredited
