LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR

Original Author: Said Taghadouini et al.

January 20, 2026

LightOn has launched LightOnOCR-2-1B, a 1B-parameter multilingual vision-language model that converts document images into organized text end to end, without a traditional OCR pipeline. It achieves state-of-the-art results on OlmOCR-Bench while being 9x smaller, and substantially faster, than the previous best-performing models. The model also predicts normalized bounding boxes for images embedded in a page, a capability refined with reinforcement learning using IoU-based rewards. Model checkpoints are released under Apache 2.0, alongside the dataset and the LightOnOCR-bbox-bench evaluation, to support further research.

LightOnOCR-2-1B: A Breakthrough in Multilingual OCR Technology

LightOn has unveiled LightOnOCR-2-1B, a multilingual vision-language model with 1 billion parameters that converts document images directly into structured text. Because it works end to end, it is intended to replace the multi-stage pipelines of traditional Optical Character Recognition (OCR) systems rather than sit on top of them.

LightOnOCR-2 reports state-of-the-art performance on the OlmOCR-Bench benchmark while being 9 times smaller, and substantially faster, than the previous best-performing models.

Key Features

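Normalized Bounding Box Prediction: Predicts bounding boxes for images embedded in the page, with coordinates normalized so they do not depend on image resolution. This makes the output more useful for documents with complex layouts.
Reinforcement Learning With IoU-Based Rewards: Refines bounding-box localization by rewarding predictions according to their intersection over union (IoU) with the reference boxes.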
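
The two features above can be illustrated with a small sketch. It is not LightOn's implementation: the function names, the (x_min, y_min, x_max, y_max) box layout, and the [0, 1] normalization range are assumptions made for illustration, and the article does not specify the model's actual output format or reward function. The sketch only shows how a pixel-space box can be made resolution-independent and how an IoU score between a predicted and a reference box is computed.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max). Hypothetical, for illustration.
Box = Tuple[float, float, float, float]


def normalize_box(box: Box, width: float, height: float) -> Box:
    """Scale a pixel-space box into [0, 1] using the page width and height."""
    x0, y0, x1, y1 = box
    return (x0 / width, y0 / height, x1 / width, y1 / height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (0.0 if they do not overlap)."""
    # Corners of the overlapping rectangle, if there is one.
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# A predicted box in pixels on a 1000 x 800 page, and a reference box
# that is already normalized.
predicted = normalize_box((100, 200, 300, 400), width=1000, height=800)
reference = (0.1, 0.25, 0.35, 0.5)
reward = iou(predicted, reference)  # higher when the boxes overlap more
```

Under these assumptions, a reward of 1.0 means the predicted box matches the reference exactly and 0.0 means the two boxes do not overlap at all, which is what makes IoU a convenient automatically checkable signal for localization.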

LightOn has released the model checkpoints under the Apache 2.0 license, along with the accompanying dataset and the new LightOnOCR-bbox-bench evaluation. This positions LightOnOCR-2-1B as a significant advancement for applications requiring quick and accurate text extraction from multilingual document images.
