When Ericsson was founded over 140 years ago, we still had an analogue world where contracts, orders, and processes in general were either handwritten or machine processed by a typewriter. Instead of digital databases, cellars were packed full of shelves and boxes where source material would be kept in archives categorized either alphabetically or chronologically. Metadata came in the form of card catalogs holding information regarding the author, title, subject, etc. and indexing and information retrieval would be carried out manually by a person fetching the right document.
Converting information into a computer-readable format, in other words digitization, began with the advancement of electronics capable of carrying out logical operations. When personal computers arrived on the scene, we suddenly had the power to delete a whole sentence without leaving a trace. We could also store digital data on a local computer or in a remote database with the help of Internet protocols. Indexing is no longer a problem either: several types of metadata are used to discover resources, and information retrieval is just one click away.
AI OCR technology: What is it?
For companies born in the analogue era with massive amounts of data archived on paper, there is significant interest in digitizing printed texts so they can be processed electronically later on. One technology that enables the conversion of images of typed, handwritten, and printed text into machine-readable text is artificial intelligence-aided optical character recognition, otherwise known as AI OCR.
AI OCR is a combination of machine learning and computer vision algorithms. These algorithms analyze document layout during pre-processing to pinpoint which information should be recorded. They might also normalize the aspect ratio of a document, clean up lines and boxes, and correct any angular deviation introduced while scanning. An OCR engine then extracts text from the scanned document. Early OCR algorithms would use light and a photocell to compare an image of a glyph against a stored glyph image. More advanced methods decompose the glyph into a vector of features and then use a nearest-neighbour search to find the stored glyph whose features are the closest match.
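To make the feature-matching idea concrete, here is a minimal sketch of a nearest-match lookup. It is an illustration only: the three features (aspect ratio, closed loops, stroke count), the numbers in `STORED_GLYPHS`, and the function name `nearest_glyph` are all assumptions, not taken from any particular OCR engine.

```python
import math

# Hypothetical stored glyph features. Each glyph is described by a small
# feature vector: (aspect ratio, closed loops, stroke count). The values
# are invented for illustration.
STORED_GLYPHS = {
    "O": (1.0, 1.0, 1.0),
    "I": (0.2, 0.0, 1.0),
    "B": (0.7, 2.0, 3.0),
    "L": (0.6, 0.0, 2.0),
}


def nearest_glyph(features, stored=STORED_GLYPHS):
    """Return (label, distance) for the stored glyph closest to `features`.

    Closeness is plain Euclidean distance between feature vectors.
    """
    return min(
        ((label, math.dist(features, ref)) for label, ref in stored.items()),
        key=lambda pair: pair[1],
    )


# Features measured from a scanned glyph (also invented).
label, distance = nearest_glyph((0.65, 1.9, 2.8))
print(label, round(distance, 3))
```

A real engine would use far richer features and a trained model, but the principle is the same: the scanned glyph is assigned the label of whichever stored glyph it sits closest to in feature space.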
How AI OCR works
The flow of events for automated document processing with OCR is shown in Figure 1. Such pipelines generally follow the same structure:
- Data input (the document) is gathered from a database or pulled from one of the front-end systems, such as a robotic process automation (RPA) bot, an email, or others.
- The document image is pre-processed to smooth edges, increase contrast, correct angular deviations, etc.
- Neural network-based automatic document classification sorts documents by type (e.g., driver’s license, bank statement, tax form, contract, invoice) and by custom subcategory (e.g., invoices from vendor A, invoices from vendor B) by identifying text content and image patterns.
- Having determined the document type, the classifier also selects the matching document definition for further content processing.
- Once the specific fields have been recognized, the structured or semi-structured text is extracted from the document and exported to the destination system.
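The classification, definition selection, and field extraction steps above can be sketched as follows. This is a simplified illustration under stated assumptions: a keyword lookup stands in for the neural classifier, and the document definitions, field names, and regular expressions are all hypothetical.

```python
import re

# Hypothetical document definitions: for each document type, the fields to
# extract and a regular expression that locates each one.
DOCUMENT_DEFINITIONS = {
    "invoice": {
        "invoice_number": re.compile(
            r"Invoice\s*(?:No\.?|Number)\s*[:#]?\s*(\S+)", re.I
        ),
        "total": re.compile(r"Total\s*:?\s*([\d.,]+)", re.I),
    },
    "purchase_order": {
        "po_number": re.compile(r"PO\s*(?:No\.?|Number)\s*[:#]?\s*(\S+)", re.I),
        "customer": re.compile(r"Customer\s*:\s*(.+)", re.I),
    },
}

# Keyword lookup standing in for the neural classifier (an assumption made
# to keep the sketch self-contained).
TYPE_KEYWORDS = {
    "invoice": ("invoice",),
    "purchase_order": ("purchase order",),
}


def classify(text):
    """Return the first document type whose keywords appear in the text."""
    lowered = text.lower()
    for doc_type, keywords in TYPE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return doc_type
    return None


def extract_fields(text):
    """Classify the text, pick its definition, and extract the fields."""
    doc_type = classify(text)
    if doc_type is None:
        return None, {}
    fields = {}
    for name, pattern in DOCUMENT_DEFINITIONS[doc_type].items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1).strip()
    return doc_type, fields


sample = "PURCHASE ORDER\nPO Number: 4500123\nCustomer: Example Telecom AB\n"
print(extract_fields(sample))
```

The point of the design is the separation of concerns: the classifier only decides which definition applies, and the definition alone decides which fields are extracted.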
If desired or needed, AI OCR allows for human verification, which is controlled by setting a confidence threshold. If the extraction confidence falls below this threshold, the data is verified manually before it is exported to the destination system. The final output of this process might be an XML, JSON, CSV, XLSX/XLS, TXT or DBF file.
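A confidence gate of this kind could look roughly like the sketch below. The threshold value, the `route` function, and the shape of the extracted data (a value paired with a confidence score per field) are assumptions chosen for illustration, not the behaviour of a specific product.

```python
import json

CONFIDENCE_THRESHOLD = 0.90  # assumed value; in practice tuned per deployment


def route(fields, threshold=CONFIDENCE_THRESHOLD):
    """Split extracted fields into an export payload and a review list.

    `fields` maps a field name to a (value, confidence) pair.
    """
    low_confidence = sorted(
        name for name, (_, confidence) in fields.items() if confidence < threshold
    )
    return {
        "needs_review": bool(low_confidence),
        "review_fields": low_confidence,
        "data": {name: value for name, (value, _) in fields.items()},
    }


# Invented extraction result: one confident field, one uncertain field.
extracted = {
    "po_number": ("4500123", 0.99),
    "total": ("1250.00", 0.72),
}

result = route(extracted)
if result["needs_review"]:
    print("Manual verification needed for:", result["review_fields"])
else:
    print(json.dumps(result["data"], indent=2))  # export as JSON
```

Here a single low-confidence field is enough to send the whole document to a human reviewer; a deployment could just as well apply the threshold to an overall document score.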
Example of AI OCR in use: Tracing and accessing purchase order data
AI OCR is a versatile technology that can be deployed in many different scenarios and for many different outcomes. Today, we are deploying the technology within Ericsson’s Group IT function to enhance the traceability and accessibility of customer purchase orders, and it is already producing significant efficiency gains and cost savings for the business.
The use case focuses on Ericsson’s Customer Purchase Order Repository (CPOR), a global repository that collects customer purchase orders (CPOs) and supports registering, searching, and displaying them, as well as analytics. This repository makes it easier for Ericsson’s finance teams, among others, to trace and access customer purchase orders at any time.
The CPOR contains purchase orders from customers worldwide, and the purchase order (PO) template varies from customer to customer and even within the same customer. It is difficult to extract data from such varied PO templates using a traditional data extraction tool: if a template changed after development, everything would have to be redeveloped from scratch. Migrating the PO extraction process to the AI OCR tool reduces development to the time needed to train the AI, which can then extract data from new templates and accommodate changes to existing ones.
CPOR and AI OCR flow
The user uploads the customer purchase order in the CPOR user interface (UI) or sends it to a common mailbox, where the CPOR tool picks up the file and places it in secure file storage for further processing. The AI OCR tool monitors the secure storage location for new files and picks them up for data extraction. Once it has picked up a file, the AI OCR tool automatically identifies the customer’s name and extracts the data based on the template training. After the data has been extracted successfully, the AI OCR tool sends it as an XML file to the CPOR application programming interface (API) so that the CPOR systems are updated. If the business rules configured in the AI OCR tool require validation, the extracted result is first sent to the validation team for examination, and the data is sent to the CPOR application only after confirmation.
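The hand-off at the end of this flow can be sketched as follows. Everything named here is hypothetical: the XML element names, the `build_cpo_xml` and `process` functions, and the example business rule are illustrations, not the actual CPOR schema or API.

```python
import xml.etree.ElementTree as ET


def build_cpo_xml(customer, fields):
    """Serialize extracted purchase order fields to an XML string.

    The element names are hypothetical, not the real CPOR schema.
    """
    root = ET.Element("CustomerPurchaseOrder")
    ET.SubElement(root, "Customer").text = customer
    data = ET.SubElement(root, "Fields")
    for name, value in fields.items():
        field = ET.SubElement(data, "Field", name=name)
        field.text = str(value)
    return ET.tostring(root, encoding="unicode")


def process(customer, fields, needs_validation):
    """Return (destination, payload) for one extracted purchase order.

    `needs_validation` is a business rule: a function that takes the
    extracted fields and returns True when a human must check the result.
    """
    payload = build_cpo_xml(customer, fields)
    destination = "validation_team" if needs_validation(fields) else "cpor_api"
    return destination, payload


# Hypothetical business rule: validate any order without a PO number.
def missing_po_number(fields):
    return not fields.get("po_number")


print(process("Example Telecom AB", {"po_number": "4500123"}, missing_po_number))
```

In this sketch a payload routed to the validation team would be forwarded to the CPOR API only after confirmation, mirroring the flow described above.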
Introducing AI OCR technology into the existing CPOR application improves data extraction quality and reduces the development time of the PO process, resulting in more accurate and timely business decisions.
Benefits of AI OCR
Is there anywhere else AI OCR offers value? AI OCR tools can offer benefits in almost any use case where repetitive human work is required to extract information from documents, non-searchable PDFs or images. Removing the common pitfalls associated with manual entry also reduces the risk of human error. All in all, this digitization technology not only improves the traceability of documents, but also helps organizations adhere to compliance guidelines by providing a central digital repository.
If we look to the future, the AI OCR service for document processing is one of the digital technologies that will enable the vision of an intelligent automation operating system, and one that learns from previously delivered automation and AI use cases to drive radical transformation and deliver superior business value.
In our next post, we will explore the downstream applications of OCR textual translation and transliteration. Stay tuned!