OCR on identity documents

MP DATA worked with a major banking and insurance group on projects to automate the processing of personal, contractual and commercial documents. In particular, handling scanned documents (identity cards, passports, bank details, etc.) is a recurring, time-consuming task every time an account or contract is opened. MP DATA was asked to design and develop an algorithm to extract data from these structured documents.


Our consultants combined several approaches based on computer vision, optical character recognition (OCR) and deep learning to develop an algorithm that identifies and extracts data from scanned structured documents, such as identity cards, passports or bank details, for customers taking out insurance policies.
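
As an illustration of how OCR output from a passport can be sanity-checked, the sketch below computes the check digit defined by ICAO Doc 9303 for fields of the machine-readable zone (MRZ). This is not MP DATA's algorithm, which is not detailed here. It assumes the OCR step has already returned the field as text, and the function names are hypothetical.

```python
# Minimal sketch: validate an OCR-read MRZ field with its ICAO 9303 check digit.
# Function names are illustrative assumptions, not part of the client solution.

DIGITS = "0123456789"
WEIGHTS = (7, 3, 1)  # repeating weights defined by ICAO Doc 9303


def mrz_char_value(ch: str) -> int:
    """Map an MRZ character to its numeric value (0-9, A-Z -> 10-35, '<' -> 0)."""
    if ch in DIGITS:
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Weighted sum of the character values, modulo 10."""
    total = sum(mrz_char_value(c) * WEIGHTS[i % 3] for i, c in enumerate(field))
    return total % 10


def field_is_consistent(field: str, check_char: str) -> bool:
    """True when the check digit read by OCR matches the recomputed one."""
    return check_char in DIGITS and mrz_check_digit(field) == int(check_char)


# Document number from the ICAO specimen passport, then a typical OCR
# confusion (letter O read instead of digit 0).
print(field_is_consistent("L898902C3", "6"))
print(field_is_consistent("L8989O2C3", "6"))
```

A failed check is a cheap signal that a field was misread and should be reprocessed or sent for manual review.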

Having been trained on a large number of documents, the algorithm is robust to low-quality scans (pixelation, misalignment, color distortion). The solution makes agents’ work easier: they no longer need to enter customer data manually, which reduces processing time and typing errors.

The algorithm was industrialised as a microservice that handles the intake, processing and storage of the documents and the extracted data in our client’s secure environment.
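
The intake, processing and storage stages can be pictured with the minimal in-memory sketch below. It is only an assumption about the general shape of such a service: the class and method names are hypothetical, the extraction step is passed in as a placeholder function, and a real deployment would sit behind an API with persistent, access-controlled storage.

```python
# Minimal sketch of an intake -> processing -> storage flow.
# All names are hypothetical; the real microservice is not described in detail.
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Tuple


@dataclass
class ExtractionRecord:
    document_id: str
    fields: Dict[str, str]


@dataclass
class DocumentPipeline:
    # Placeholder for the extraction algorithm: scan bytes in, fields out.
    extract: Callable[[bytes], Dict[str, str]]
    pending: Deque[Tuple[str, bytes]] = field(default_factory=deque)
    store: Dict[str, ExtractionRecord] = field(default_factory=dict)

    def intake(self, document_id: str, scan: bytes) -> None:
        """Queue a scanned document for processing."""
        if not scan:
            raise ValueError("empty scan")
        self.pending.append((document_id, scan))

    def process_all(self) -> int:
        """Run extraction on every queued document and store the results."""
        processed = 0
        while self.pending:
            document_id, scan = self.pending.popleft()
            self.store[document_id] = ExtractionRecord(document_id, self.extract(scan))
            processed += 1
        return processed


def fake_extract(scan: bytes) -> Dict[str, str]:
    """Stand-in extractor used only to exercise the flow."""
    return {"size_bytes": str(len(scan))}


pipeline = DocumentPipeline(extract=fake_extract)
pipeline.intake("doc-1", b"abc")
pipeline.process_all()
print(pipeline.store["doc-1"].fields)
```

Keeping the extractor as an injected function means the model can be retrained or replaced without changing the intake and storage code.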


Time savings for agents
when entering customer data

Increased data reliability
by eliminating manual data entry

Efficiency and robustness
of the algorithm on low-quality scans