French AI company Mistral has launched Mistral OCR.

Last week, French AI company Mistral launched Mistral OCR.

Mistral OCR is an optical character recognition (OCR) API for document understanding that Mistral describes as setting a new standard. Unlike traditional OCR tools, it can identify and interpret the different elements within a document, including text, images, tables, and complex content such as mathematical equations, with high accuracy. It accepts images and PDF files as input and outputs the extracted text and images in an organized form.

Mistral OCR is particularly well suited to integration with Retrieval-Augmented Generation (RAG) systems that need to process complex multimodal documents, such as slide decks or content-rich PDFs.
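As a rough illustration of how OCR output might feed a RAG pipeline, the sketch below splits per-page text into overlapping chunks ready for embedding. The page structure (a list of dictionaries with `index` and `markdown` keys) and the helper name `chunk_pages` are assumptions made for this example, not a documented Mistral OCR response format.

```python
from typing import Dict, List


def chunk_pages(pages: List[Dict], max_chars: int = 500, overlap: int = 50) -> List[Dict]:
    """Split per-page text into overlapping chunks for retrieval.

    `pages` is a hypothetical OCR result: one dict per page, with the
    page number under `index` and the page text under `markdown`.
    """
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")

    step = max_chars - overlap  # how far the window advances each time
    chunks = []
    for page in pages:
        text = page["markdown"]
        for start in range(0, len(text), step):
            piece = text[start:start + max_chars]
            if piece.strip():  # skip whitespace-only windows
                chunks.append({"page": page["index"], "start": start, "text": piece})
            if start + max_chars >= len(text):
                break  # this window already reached the end of the page
    return chunks
```

Keeping the page number on every chunk lets a retrieval system cite the page an answer came from. The chunk size and overlap are arbitrary defaults and would normally be tuned to the embedding model in use.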

Mistral has made Mistral OCR the default document understanding model for millions of users on the Le Chat platform. The newly launched API (mistral-ocr-latest) is priced at approximately 1,000 pages per dollar, and batch inference roughly doubles the number of pages per dollar.
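To make the pricing concrete, here is a small back-of-the-envelope estimator. It uses only the approximate figures quoted above (about 1,000 pages per dollar, roughly doubled under batch inference); the function name and constants are illustrative, and actual billing may differ.

```python
PAGES_PER_DOLLAR = 1000  # approximate figure quoted in the article
BATCH_MULTIPLIER = 2     # "about twice" the pages per dollar with batch inference


def estimate_cost_usd(pages: int, batch: bool = False) -> float:
    """Return a rough cost estimate in US dollars for processing `pages` pages."""
    rate = PAGES_PER_DOLLAR * (BATCH_MULTIPLIER if batch else 1)
    return pages / rate


print(estimate_cost_usd(50_000))              # 50.0
print(estimate_cost_usd(50_000, batch=True))  # 25.0
```

On these figures, a 50,000-page archive would cost about $50 through the standard API and about $25 with batch inference.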

Mistral OCR Highlights

Outstanding complex document understanding capability
Mistral OCR has a strong understanding of documents containing complex elements such as interwoven images, mathematical expressions, tables, and advanced typesetting (such as LaTeX), especially suitable for scientific papers rich in charts, formulas, and images.
Native multilingual and multimodal support
Mistral OCR stands out for its multilingual processing: it can accurately parse thousands of scripts, fonts, and languages from every continent, which suits both global organizations and businesses focused on local markets.
Top-tier performance benchmark
In rigorous benchmarking, Mistral OCR consistently outperforms other leading OCR models. It excels in overall performance, mathematical content recognition, multilingual support, scanned document recognition, and table recognition.
Fastest in its class
The Mistral OCR model is lightweight and considerably faster than comparable products: a single node can process up to 2,000 pages per minute, which suits high-throughput workloads.
Documents as prompts, structured output
Mistral OCR introduces an approach in which documents serve as prompts: users can extract specific information from a document and have it returned in a structured format such as JSON, which can then feed more advanced automation workflows.
Optional self-hosting for sensitive or confidential data
For organizations with strict data privacy requirements, Mistral OCR is available for self-deployment on a selective basis, to keep sensitive or confidential information secure.
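The "documents as prompts" idea can be pictured as asking for specific fields and receiving JSON back. The sketch below covers only the downstream step: parsing a JSON string and checking that the requested fields are present before handing the record to an automation workflow. The field names (`title`, `authors`, `tables`) and the sample response are invented for illustration; the article does not specify the API's request or response format.

```python
import json
from typing import Any, Dict, Sequence

# Hypothetical fields a workflow might request from a scientific paper.
REQUIRED_FIELDS = ("title", "authors", "tables")


def parse_structured_output(raw: str, required: Sequence[str] = REQUIRED_FIELDS) -> Dict[str, Any]:
    """Parse a JSON string and verify that every required field is present."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [field for field in required if field not in data]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return data


# Invented sample response, used only to exercise the parser.
sample = '{"title": "Example paper", "authors": ["A. Author"], "tables": []}'
record = parse_structured_output(sample)
print(record["title"])  # Example paper
```

Validating the structure up front means a malformed or incomplete extraction fails loudly instead of propagating into later steps.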

Use cases

Digitization of Scientific Research: Multiple research institutions use Mistral OCR to convert scientific papers and journals into AI-readable formats, making research collaboration more efficient.
Heritage and Cultural Relics Protection: Cultural protection organizations and non-profit institutions use Mistral OCR to digitize historical documents, making them more widely accessible.
Optimizing Customer Service: Customer service departments use Mistral OCR to convert manuals and documents into searchable knowledge bases, improving response times.
Technical Document Conversion: Mistral OCR helps companies convert technical literature, engineering drawings, educational notes, presentations, and legal documents into AI-ready formats, making document processing more efficient.

Outstanding performance

In Mistral's benchmark testing, Mistral OCR consistently outperformed other leading OCR models. The comparison covered mainstream products such as Google Document AI, Microsoft Azure OCR, Gemini, and GPT-4o. The test data included mathematical formulas, multilingual content, scanned documents, and complex tables, drawn from PDF and image formats commonly found on the web. According to the results, Mistral OCR led in overall performance, math recognition, multilingual support, scanned document recognition, and table recognition.

Native multilingual support

Since its inception, Mistral has been committed to serving global users by continuously optimizing and expanding the multilingual capabilities of its models. The newly launched Mistral OCR extends this capability: it can efficiently parse, understand, and accurately transcribe thousands of scripts, fonts, and languages from every continent.

This broad language support helps global enterprises handle documents in many different languages, and it also gives niche businesses focused on local markets an efficient solution.