Digital transformation has become mandatory for businesses that want to innovate, increase creativity, and compete in the global market. Whether a business is deploying a standalone process automation solution or building a comprehensive digital transformation program, one of the first indispensable steps is to digitize text documents, with Optical Character Recognition (OCR) as the core technology.
To meet that need, Mr. Hoang Anh (FPT Software) and colleagues researched and developed akaDoc, an OCR platform that helps digitize all documents in companies and organizations. akaDoc allows users to define the data areas they want to convert. It can therefore be applied to any type of document and supports many different languages.
Within three months, building on a number of smaller OCR solutions, members of FPT Software released akaDoc. akaDoc combines Optical Character Recognition (OCR) with modern machine learning techniques and Natural Language Processing (NLP), and can help save 60-80% of the cost of existing data entry processes at companies and organizations.
The product automatically extracts content from photographs or scanned documents and stores the digitized data in a database. Users can easily create and save templates for documents by marking the information fields that matter to them. Documents with a similar format are then categorized and processed automatically to extract that information, helping businesses automate their processes.
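The template idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that a template maps field names to rectangular regions in normalized page coordinates and that the OCR step returns words with bounding boxes. The names `Box`, `Word`, `extract_fields`, and `INVOICE_TEMPLATE` are hypothetical and are not part of akaDoc's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in normalized page coordinates (0.0 to 1.0)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """One recognized word and where it sits on the page."""
    text: str
    box: Box

    @property
    def center(self) -> tuple:
        return ((self.box.left + self.box.right) / 2,
                (self.box.top + self.box.bottom) / 2)


def extract_fields(template: dict, words: list) -> dict:
    """Collect, for each template field, the words whose centers fall in its region."""
    fields = {}
    for name, region in template.items():
        inside = [w for w in words if region.contains(*w.center)]
        # Naive reading order: top to bottom, then left to right.
        inside.sort(key=lambda w: (w.box.top, w.box.left))
        fields[name] = " ".join(w.text for w in inside)
    return fields


# Hypothetical template: two fields a user might mark on an invoice layout.
INVOICE_TEMPLATE = {
    "invoice_no": Box(0.60, 0.05, 0.95, 0.10),
    "total": Box(0.60, 0.85, 0.95, 0.92),
}

# Hypothetical OCR output for one scanned page.
words = [
    Word("INV-0042", Box(0.62, 0.06, 0.80, 0.09)),
    Word("1,250.00", Box(0.62, 0.86, 0.75, 0.90)),
    Word("USD", Box(0.77, 0.86, 0.83, 0.90)),
    Word("Thank", Box(0.10, 0.50, 0.20, 0.53)),
]

result = extract_fields(INVOICE_TEMPLATE, words)
```

Here `result` holds one string per template field, and words outside every region (such as "Thank") are ignored. A saved template of this kind could be reused for every document that shares the same layout.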
akaDoc applies image processing techniques to fix the most common errors and improve image quality, automatically removing noise or background and rotating and enlarging images before extraction. With natural language processing algorithms, akaDoc can reduce misidentification cases by 60-90%, and it uses the updated data as training input for the AI engine so that the OCR function gets smarter over time. As a result, businesses can make decisions quickly while ensuring data privacy and information security.
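As a rough illustration of this kind of cleanup, the sketch below applies a 3x3 median filter to remove isolated specks and then enlarges the image by pixel repetition. It is a generic textbook approach written in plain Python for a grayscale image stored as a list of rows, not akaDoc's actual pipeline, and it leaves out rotation. The function names `median_filter` and `upscale` are assumptions made for this example.

```python
from statistics import median_low


def median_filter(image: list) -> list:
    """Replace each pixel with the median of its 3x3 neighborhood.

    Neighborhoods are clipped at the image border, so edge pixels use
    fewer neighbors. median_low keeps the result an actual pixel value.
    """
    height, width = len(image), len(image[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(median_low(window))
        filtered.append(row)
    return filtered


def upscale(image: list, factor: int) -> list:
    """Enlarge an image by repeating every pixel `factor` times in both directions."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


# A white 3x3 patch with a single black speck in the middle.
noisy = [
    [255, 255, 255],
    [255, 0, 255],
    [255, 255, 255],
]

clean = median_filter(noisy)   # the speck is replaced by its neighbors' value
enlarged = upscale(clean, 2)   # 6x6 image
```

A production system would more likely rely on an image library for these steps; the pure-Python version only makes the idea concrete.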
Mr. Vu Minh Phong said: “To do this, the project team directly tackled post-OCR processing, the most complicated stage, to correct post-OCR errors automatically. Specifically, we applied machine learning methods together with probability and statistics to correct erroneous phrases produced while reading information in images and converting it into words.”
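The team's actual models are not described, but the general idea of statistical post-OCR correction can be sketched as follows. The example assumes a small table of characters that OCR engines commonly confuse and a word-frequency lexicon. For each unknown token it generates look-alike variants and keeps the most frequent one found in the lexicon. `CONFUSIONS`, `candidates`, `correct`, and the toy corpus are all hypothetical.

```python
from collections import Counter
from itertools import product

# Assumed look-alike groups; a real system would learn these from data.
CONFUSIONS = {
    "0": "0O", "O": "O0",
    "1": "1lI", "l": "l1I", "I": "Il1",
    "5": "5S", "S": "S5",
}


def candidates(token: str) -> set:
    """All strings obtained by swapping confusable characters in the token.

    The number of variants grows exponentially with the count of
    confusable characters, which is acceptable for short words only.
    """
    options = [CONFUSIONS.get(ch, ch) for ch in token]
    return {"".join(combo) for combo in product(*options)}


def correct(token: str, freq: Counter) -> str:
    """Return the token unchanged if known, else its most frequent known variant."""
    if token in freq:
        return token
    known = sorted(c for c in candidates(token) if c in freq)
    if not known:
        return token  # nothing better to offer
    return max(known, key=lambda c: freq[c])


# Toy lexicon built from a tiny, made-up corpus.
freq = Counter("Invoice total Invoice number total amount".split())

fixed = [correct(t, freq) for t in ["lnvoice", "tota1", "amount", "xyz1"]]
```

In this run `fixed` contains the corrected tokens in order, and a token with no known variant ("xyz1") is passed through unchanged. Real post-OCR correction would typically also use context from neighboring words rather than scoring each token in isolation.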
The solution can be deployed either in the cloud or on the customer’s own infrastructure. The product kit includes a web server, an API SDK for the cloud, and a mobile SDK. The project team has also published the entire API of the OCR platform so that units within FPT (especially FSOFT) can use it to develop their own digital solutions.
Notably, akaDoc is the fruit of the efforts of many technology experts now working at FPT Software, including people who studied in the US as well as valedictorians and talented engineers from well-known universities such as FU, HUST, and University of Technology. All of them share the same goal: making akaDoc one of the best document digitization products in the world.
akaDoc is currently deployed at about 30 enterprises and corporations in banking, insurance, government, and healthcare in Vietnam, Malaysia, and Indonesia, with outstanding accuracy compared with other solutions. In the near future, building on the akaDoc OCR platform, the project team will continue to launch digital products and solutions in the ecosystem, including eBizCard, Skill Inventory, Customer Onboarding, and Invoice Automation.