Deep Document Understanding
for unlocking the value of your digital document assets
The ability to process and extract key information from large and complex documents is extremely valuable across many industries.
Gain speed and agility in business processes that deal with unstructured, born-digital documents in PDF and Doc formats.
Extract information from documents, validate document data against business rules, compare two or more documents, and categorise and search documents based on concepts and key terms.
Aunwesha is your partner in moving from document chaos to insight mining for improving decision making and operations by providing 2 unique digital platforms:
- LearnITy™ Knowledge Engine (LKE) is a document understanding platform built on AI and NLP technologies that ingests organisational born-digital document assets (PDF, Word, Excel) and incrementally converts them to actionable corporate knowledge with provision for validation and augmentation by human experts
- LearnITy™ Conversation (LC) is an AI and NLP enabled platform for easily building Chatbots and connecting them to organisational knowledge assets
These platforms are simple yet powerful, cost effective, easily customisable and implementable, and may be used across industry verticals (Banking and Finance, Insurance, Health, Legal, etc.) and business functions (Corporate, Finance, Legal, Customer Service, Operation, HR, etc.).
AI in general, and Natural Language Processing in particular, has tremendous social impact by simplifying and democratising access to the knowledge hidden in organisational document assets. The insights obtained by analysing these assets are now available to a larger audience, thus creating a level playing field. Aunwesha's suite of platforms provides a unique combination of state-of-the-art AI and human domain knowledge.
Go beyond simple IDP: The business case for Deep Document Understanding
Data is the new oil for the economy. 80% of business-relevant information originates in unstructured form, primarily text. Businesses run on documents; they are at the heart of all business processes. Over 500 billion MS Office documents are created every year, and there may be up to 2.5 trillion PDF documents in the world.
- Many of these documents are large and complex where related information may be dispersed in widely separated pages and hidden in different structures such as tables and lists (e.g., Company Annual Reports).
- Businesses usually have to deal with diverse types of documents from multiple domains and functions, with widely varying structures and formats.
- Users often have business questions that require side-by-side analysis of multiple documents (e.g., multiple tender/RFC responses).
The current set of tools in the market, categorised variously as IDP (Intelligent Document Processing), TPA (Text Process Automation), or RPA (Robotic Process Automation), cannot address these large and/or complex document-centric problems. They mostly deal with small and simple documents, where the focus is on automating processing (to whatever extent possible) rather than on analysing the document in depth for decision making.
As a result, organisations are forced to spend an exorbitant amount of time and effort processing their documents to find business insights and to use the results in decision making and as inputs to other business processes.
What is required is a move beyond simple document processing to a deeper level of document comprehension, or understanding, based on deep domain knowledge of the business. This is the kind of knowledge that language models cannot learn by being fed a large corpus of documents; rather, it lies with the business users.
It is also important to remember that business requirements change constantly in this era of tumultuous business conditions, where “change is the only constant”. Hence, document understanding requirements also keep changing to support the new insights required from documents. Thus, any automation effort in this knowledge-intensive domain requires human supervision and oversight, termed “human-in-the-loop”.
As NuSafeX, a Swiss company, we provide IT solutions to increase the safety of the civilian nuclear industry. We propose in particular intelligent and efficient processing of nuclear safety reports and documentation as SaaS, to accelerate and improve the current repetitive, time consuming and costly manual process of analysing these documents and decision taking. Upon searching, we called for a few proposals for intelligent document understanding using AI & ML. This would have to be a cost-effective automation solution with reasonably high accuracy and speed of data
We had the pleasure to engage with Aunwesha Knowledge Technologies Pvt Ltd, a niche player in IT & ITES space, for our requirements. They have developed an amazing platform based on ML and NLP latest technologies for machine deep understanding of documents.
We have used their platform LearnITy Knowledge Engine for the purposes of search and extraction of unstructured data from different digital assets as well as for comparison and validation checks. Their deep document analysis and customised reporting capabilities are truly exemplary.
We continue to collaborate with Aunwesha Knowledge Technologies for various other custom requirements and overseas projects as we found the team to be capable of driving successfully such complex work.
Frédéric DEGUILLAUME, PhD
Co-founder & CTO