PaddleOCR 3.5 Adds Transformers Backend for Document AI Pipelines

PaddleOCR adds a Transformers inference option

PaddleOCR 3.5 introduces a pluggable inference backend architecture. Supported models (PP-OCRv5, PaddleOCR-VL 1.5) can now run with Hugging Face Transformers as the inference runtime by setting engine="transformers" in the Python API or CLI.

Previously, PaddleOCR models ran only on Paddle's static or dynamic graph engines. The new interface abstracts this choice behind an engine parameter. Developers configure backend-specific options such as dtype, device placement, and attention implementation through engine_config. PaddleOCR still manages the OCR and document parsing pipelines; the backend is only the runtime layer.
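A minimal sketch of how the engine and engine_config parameters described above might be wired together. The parameter names engine and engine_config and the value "transformers" come from the release. The key names inside engine_config (dtype, device, attn_implementation), their default values, and both helper functions are assumptions for illustration, so check the PaddleOCR 3.5 documentation for the exact spelling.

```python
from typing import Any, Dict


def build_backend_kwargs(
    dtype: str = "bfloat16",
    device: str = "cuda:0",
    attn_implementation: str = "sdpa",
) -> Dict[str, Any]:
    """Collect the backend options in one place (hypothetical helper)."""
    return {
        # "engine" and "engine_config" are the parameter names given in the release.
        "engine": "transformers",
        "engine_config": {
            # ASSUMPTION: these key names mirror the options the article lists
            # (dtype, device placement, attention implementation); the real
            # keys may be spelled differently.
            "dtype": dtype,
            "device": device,
            "attn_implementation": attn_implementation,
        },
    }


def create_ocr_pipeline(**overrides: Any):
    """Build a PaddleOCR pipeline on the Transformers backend (hypothetical helper).

    paddleocr is imported lazily so the configuration helper above can be
    used and tested on machines where PaddleOCR is not installed.
    """
    from paddleocr import PaddleOCR

    return PaddleOCR(**build_backend_kwargs(**overrides))
```

Keeping the backend options in a single dictionary means a switch back to the default Paddle backend touches one function and leaves the pipeline code that calls PaddleOCR unchanged.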

The integration is available now. A live demo runs on Hugging Face Spaces. Installation requires PaddleOCR 3.5.0, PaddleX 3.5.2, Transformers 5.4.0 or later, and a PyTorch build that matches your target hardware.
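The version requirements above can be checked before any model is loaded. The sketch below uses only the Python standard library and the PyPI distribution names paddleocr, paddlex, transformers, and torch. It treats every stated version as a minimum, which is an assumption about how the requirement list should be read, and the helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Callable, Dict, List, Optional, Tuple

# Versions taken from the article; None means "any installed version".
# ASSUMPTION: each version is interpreted as a lower bound.
REQUIREMENTS: Dict[str, Optional[str]] = {
    "paddleocr": "3.5.0",
    "paddlex": "3.5.2",
    "transformers": "5.4.0",
    "torch": None,
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '5.4.0' or '2.3.1+cu121' into a tuple of its leading integers.

    Pre-release and local suffixes are ignored, so '5.4.0rc1' compares
    equal to '5.4.0'. Short versions are zero-padded to three components.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> List[str]:
    """Return a human-readable list of unmet requirements (empty if all pass)."""
    problems = []
    for name, minimum in REQUIREMENTS.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name} is not installed")
        elif minimum is not None and parse_version(found) < parse_version(minimum):
            problems.append(f"{name} {found} is older than required {minimum}")
    return problems
```

Calling check_requirements() with no arguments inspects the current environment. The lookup argument exists only so the check can be exercised against a fake package table.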

Integration friction drops for PyTorch-first shops

Document ingestion is the bottleneck in RAG, Document AI, and agent pipelines. Weak OCR or parsing sends corrupted or missing context downstream, which degrades retrieval and factual accuracy. Teams already know this; the remaining problem is operational.

Many teams standardize on PyTorch and Transformers for model loading, training, and deployment. Adding PaddleOCR meant importing a second inference runtime and managing separate model artifact paths. The Transformers backend eliminates that dual-runtime penalty for teams already serving Transformers models in production.

This is not a performance play. PaddleOCR's default Paddle static graph backend is still the throughput choice. The Transformers option trades maximum OCR speed for operational simplicity: one dependency tree, one Hub-compatible model discovery surface, one inference orchestration path.

Decide your backend based on your constraint

Use the Transformers backend if your team already owns PyTorch/Transformers infrastructure and integration cost is your limiting factor. You gain familiar APIs and Hub-native model distribution.

Stick with the Paddle static graph backend if OCR or document parsing throughput is the hard constraint and you can absorb the operational cost of running a second inference runtime.

The release is explicitly not a consolidation. PaddleOCR continues to expose both backends. The win is optionality: pick the inference runtime that matches your stack, not the other way around.

Start by testing on a non-critical document batch (invoices, contracts, charts) in your environment. Measure cold-start latency, per-document cost, and memory footprint under your actual page volumes and hardware. Then decide whether to migrate or stay.
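The measurement step above can be sketched as a small backend-agnostic harness. It is a rough illustration and not a rigorous benchmark: the function and argument names are hypothetical, cold start is defined here as pipeline load plus the first document, and tracemalloc only sees Python-level allocations, so GPU and native memory need a separate tool (for example torch.cuda.max_memory_allocated on CUDA devices).

```python
import statistics
import time
import tracemalloc
from typing import Any, Callable, Dict, Iterable


def benchmark_backend(
    load_pipeline: Callable[[], Any],
    run_document: Callable[[Any, Any], Any],
    documents: Iterable[Any],
) -> Dict[str, float]:
    """Time one backend on a batch of documents.

    load_pipeline builds the pipeline; run_document processes one document
    with it. Both are supplied by the caller, so the same harness can time
    the Paddle backend and the Transformers backend.
    """
    docs = list(documents)
    if len(docs) < 2:
        raise ValueError("need at least two documents: one cold, one warm")

    tracemalloc.start()
    try:
        # Cold start: build the pipeline and process the first document.
        start = time.perf_counter()
        pipeline = load_pipeline()
        run_document(pipeline, docs[0])
        cold_start_s = time.perf_counter() - start

        # Warm latency: every remaining document, timed individually.
        warm_latencies = []
        for doc in docs[1:]:
            start = time.perf_counter()
            run_document(pipeline, doc)
            warm_latencies.append(time.perf_counter() - start)

        # Peak Python heap usage since tracemalloc.start().
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()

    return {
        "warm_documents": float(len(warm_latencies)),
        "cold_start_s": cold_start_s,
        "warm_mean_latency_s": statistics.fmean(warm_latencies),
        "warm_max_latency_s": max(warm_latencies),
        "peak_python_mem_mib": peak_bytes / (1024 * 1024),
    }
```

Running the harness once per backend on the same document batch and hardware gives directly comparable numbers. Per-document cost can then be derived from the warm latency and your own instance pricing.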
