At a Glance
Manually extracting data from large volumes of documents is time-consuming, challenging, and expensive: documents such as invoices and receipts don’t follow a standard format, and analyzing them takes dedicated time. This online business payments company had explored numerous solutions that yielded low-accuracy results.
After Google Vision API OCR extracted the text from invoice images, custom machine learning models were developed to give context to the OCR results and predict the location of specific fields such as the total, invoice date, and due date, reducing the time needed to process invoices.
The initial results showed a 30% accuracy increase over commercial solutions. Once the solution was integrated into the company’s usual workflow, a continuous cycle of model experimentation further improved accuracy and accelerated the data extraction process. Unlike other business payment providers that leverage OCR (optical character recognition), the resulting computer vision AI solution brings unparalleled consistency and predictability to the invoice process.
There is opportunity in automating back- and middle-office tasks. That’s why the leading online business payments solutions company and CI&T decided to take this on, to ensure a more seamless process for reconciling received invoices with scheduled payments.
The company receives millions of invoices every month from its 2.5 million members, representing over $50 billion per year in payment volume. However, only a portion of its customers take advantage of the service that manually processes and schedules members’ customer payments. The goal is to offer this end-to-end service to all members.
The challenge: accurately extracting specific data from invoices to automate bill payment scheduling. The issue is that invoices don’t have a standard template and come in various formats. Optical Character Recognition (OCR) solutions on their own can digitize the data but won’t provide context for it. Is $100.00 the total amount or the unit price?
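The ambiguity can be illustrated with a minimal Python sketch. It assumes OCR output has already been reduced to words with page coordinates; the `OcrWord` type, the coordinates, and the sample invoice are hypothetical and are not taken from the project.

```python
import re
from dataclasses import dataclass


@dataclass
class OcrWord:
    """One word as an OCR engine might return it: text plus a position."""
    text: str
    x: float  # horizontal centre of the word on the page
    y: float  # vertical centre of the word on the page


# Matches amounts such as "$50.00" or "1,250.00" (illustrative pattern only).
MONEY = re.compile(r"^\$?\d{1,3}(,\d{3})*\.\d{2}$")


def money_candidates(words):
    """Return every word that looks like a monetary amount."""
    return [w for w in words if MONEY.match(w.text)]


# A hypothetical invoice: one line item and a total row.
words = [
    OcrWord("Widget", 40, 300),
    OcrWord("2", 300, 300),
    OcrWord("$50.00", 380, 300),
    OcrWord("$100.00", 500, 300),
    OcrWord("Total", 400, 420),
    OcrWord("$100.00", 500, 420),
]

candidates = money_candidates(words)
for word in candidates:
    print(word.text, (word.x, word.y))
```

Text matching alone yields three amounts, two of which read "$100.00". Nothing in the raw text says which one is the invoice total; that is the context a model has to supply.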
Although invoices may follow a certain informational hierarchy, computer systems still struggle to extract specific data from them accurately. Precisely identifying the company name, due date, and total amount of an invoice from a scanned image of the bill has therefore been a challenge. After the payments solutions company tried various solutions that yielded sub-par results, we approached them to start a pilot project.
An initial evaluation of more than 10,000 invoices and custom machine learning models led to promising results regarding the ability of a model to provide context to data extracted through OCR.
That evaluation expanded to more than 50,000 invoices with models able to predict the location of specific fields of interest such as the total amount, invoice date, due date, etc.
The machine learning models were trained based on existing data from tens of thousands of invoices that were manually entered, providing a solid ground truth to evaluate the accuracy and confidence level for each field prediction.
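As a rough illustration of how manually entered data can serve as ground truth, the sketch below computes per-field accuracy by exact match. The function name, field names, and sample values are hypothetical; the project's actual evaluation method and confidence scoring are not described in this article.

```python
from collections import defaultdict


def field_accuracy(predictions, ground_truth):
    """Fraction of invoices for which each field was predicted exactly right.

    Both arguments map an invoice id to a dict of field name -> value.
    A field missing from the predictions counts as incorrect.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for invoice_id, truth in ground_truth.items():
        predicted = predictions.get(invoice_id, {})
        for field, expected in truth.items():
            total[field] += 1
            if predicted.get(field) == expected:
                correct[field] += 1
    return {field: correct[field] / total[field] for field in total}


# Hypothetical manually entered values (ground truth) and model output.
ground_truth = {
    "inv-1": {"total": "100.00", "due_date": "2017-03-01"},
    "inv-2": {"total": "250.00", "due_date": "2017-04-15"},
}
predictions = {
    "inv-1": {"total": "100.00", "due_date": "2017-03-01"},
    "inv-2": {"total": "25.00", "due_date": "2017-04-15"},
}

accuracy = field_accuracy(predictions, ground_truth)
print(accuracy)
```

Reporting accuracy per field rather than per invoice makes it visible when one field, such as the total, is harder to locate than the others.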
By going beyond OCR, the solution could determine where specific fields were located within the document, as well as the information around them.
The customized algorithms developed by CI&T’s Data Science Team used approaches such as text-mask clustering, region of interest by cluster, and label distance vectors to account for the large variability of templates and the visual information humans rely on when performing a similar task.
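The article does not detail these techniques. As one plausible reading of a label distance vector, the sketch below measures the offset from a label word such as "Total" to each candidate value and picks the closest candidate. The `Box` type, the coordinates, and the nearest-candidate rule are assumptions made for illustration, not CI&T's implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """A piece of text with the centre of its bounding box on the page."""
    text: str
    x: float
    y: float


def label_distance_vector(label, candidate):
    """Offset (dx, dy) from a label to a candidate value."""
    return (candidate.x - label.x, candidate.y - label.y)


def nearest_to_label(label, candidates):
    """Pick the candidate with the shortest distance vector from the label."""
    return min(
        candidates,
        key=lambda c: math.hypot(*label_distance_vector(label, c)),
    )


# Hypothetical layout: a "Total" label and three monetary amounts.
total_label = Box("Total", 400, 420)
amounts = [
    Box("$50.00", 380, 300),   # unit price in the line-item row
    Box("$100.00", 500, 300),  # line amount in the line-item row
    Box("$100.00", 500, 420),  # amount on the same row as the label
]

chosen = nearest_to_label(total_label, amounts)
print(chosen, label_distance_vector(total_label, chosen))
```

In a learned model, vectors like these would more likely be input features than a hard rule, since on some templates the value sits below its label rather than beside it.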
A complex series of computer vision algorithms was used to identify potential patterns. OCR was leveraged to build a model with an enhanced understanding of where key information was positioned on invoices. The approach made it easier for users to make sense of their large volumes of invoices and dramatically reduced the tedious time spent manually entering data.
The initial version of the solution achieved 30% better accuracy than the off-the-shelf market solutions the company had tested before starting the project. Once the model was integrated with their workflow, a continuous cycle of model experimentation further improved accuracy and accelerated data extraction, with invoices processed in under 2 seconds.
Such a service can eliminate extraneous data entry by automatically processing invoices and presenting them for review and approval, making processes more efficient and potentially saving employees countless hours. Additionally, the customized algorithms provide an accurate and secure solution for handling sensitive customer information.
The model created by CI&T leveraged Google products to help deliver better results and a solution that could easily be integrated into the company’s internal system:
- Google Vision API
- Google AppEngine (GAE)
- Google Cloud Endpoints
- Google Cloud Storage (GCS)
- Google Compute Engine (GCE)
- Google Data Studio
- Google BigQuery
- Google TaskQueue
The solution was shortlisted for the 2017 AIconics Awards from among more than 200 entries. The AIconics are the world’s only independently judged awards for practical applications of AI in business. The awards recognize the achievements and advances of the firms pushing the development of these burgeoning technologies forward.