Automating contract intelligence with Doczy.ai™ on AWS

Source: AWS Architecture

Extracting actionable insights from thousands of contracts and legal documents remains a challenge. For organizations, critical business information is locked in unstructured documents such as contracts, legal agreements, provider arrangements, and vendor invoices. Extracting and operationalizing this information has traditionally been a manual, error-prone, and resource-intensive process. This leads to missed savings opportunities, costly delays, and significant inefficiencies across the enterprise.

AArete, a global management and technology consulting firm specializing in healthcare, recognized this challenge and developed Doczy.ai™, an intelligent contract interpretation solution powered by generative AI on Amazon Web Services (AWS).

In this post, we show you how Doczy.ai™ uses generative AI on AWS to automate contract intelligence at scale, transforming unstructured documents into structured, actionable insights, so organizations can automate critical business processes and unlock the full value of their data.

The challenge: Data trapped in documents

For healthcare organizations, managing and interpreting contracts and documents represents a major operational bottleneck. Manual review requires deploying teams to extract data from thousands of documents, an approach that is costly, highly error-prone, and neither scalable nor sustainable. Organizations relying on institutional knowledge face additional risks: critical information resides with a few key individuals, creating knowledge silos and succession planning challenges. Existing Contract Lifecycle Management (CLM) systems often prove inadequate for capturing the nuanced and complex terms unique to each agreement. These legacy systems can only capture predefined fields, missing the rich detail and contextual information that distinguishes one contract from another. The downstream impact is substantial: in healthcare, reimbursement terms must be manually translated into claims systems, a slow, error-prone process. Similarly, verifying vendor invoices against contract terms often requires manual effort, leading to payment processing delays and missed contractual savings opportunities. These inefficiencies ultimately leave significant value on the table.

This is where Doczy.ai™ provides significant value.

Doczy.ai™: An intelligent contract interpretation solution

Doczy.ai™ directly addresses these challenges using advanced AI and scalability on AWS. Developed by AArete, Doczy.ai™ pushes the boundaries of document intelligence. The solution automatically interprets complex documents and converts them into a structured, queryable information repository that allows organizations to unlock the full value of their data and drive smarter decisions.

The evolution of Doczy.ai™ reflects rapid AI advancement. Prior to 2020, document processing required manual effort, with individuals processing approximately 100 documents per week. Between 2020 and 2023, the firm implemented rules-based contract processing, achieving approximately 55% accuracy. The breakthrough came in 2024, when AI-based processing built on AWS achieved 99% accuracy, a dramatic improvement over the 55% accuracy of the rules-based approach.

Doczy.ai™ architecture

Doczy.ai™ is built on a comprehensive AWS architecture designed to handle the entire document processing lifecycle: from the moment a file enters the system to the moment it generates actionable business intelligence.

Architecture of Doczy.ai™

External users access the platform through a secure Next.js frontend, with Amazon Cognito managing authentication and authorization behind the scenes. After authentication, users upload documents directly to Amazon Simple Storage Service (Amazon S3), where durable, scalable object storage ensures nothing is lost and everything is accessible at scale. From there, the real intelligence begins.
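The post does not say how the browser upload to Amazon S3 is authorized. One common pattern, shown here purely as an assumption, is for the backend to hand an authenticated user a short-lived presigned URL. In the minimal sketch below, the key layout, the function names, and the idea that `user_id` comes from the Amazon Cognito token are all hypothetical; `generate_presigned_url` is the standard boto3 S3 client method.

```python
import re
import uuid


def build_object_key(user_id: str, filename: str) -> str:
    """Build a per-user S3 key. The 'uploads/<user>/<id>/<name>' layout is a
    hypothetical convention, not something described in the post."""
    # Replace anything outside a conservative character set with '_'.
    safe_name = re.sub(r"[^A-Za-z0-9._-]", "_", filename)
    return f"uploads/{user_id}/{uuid.uuid4().hex}/{safe_name}"


def create_upload_url(s3_client, bucket: str, user_id: str, filename: str,
                      expires_in: int = 900) -> dict:
    """Return a short-lived URL that the browser can PUT the document to.

    s3_client is expected to behave like boto3.client("s3"); it is passed in
    so the function can be exercised without AWS credentials.
    """
    key = build_object_key(user_id, filename)
    url = s3_client.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds
    )
    return {"url": url, "key": key}
```

In a real deployment the caller would pass `boto3.client("s3")` and a user identifier taken from the verified Cognito token, so that the application servers never have to relay the document bytes themselves.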

An AWS Lambda function triggers Amazon Textract to extract text and metadata from documents in various formats. What sets Doczy.ai™ apart at this stage is its patented “smart chunking” algorithm, a proprietary approach that goes far beyond pulling words off a page. Rather than treating a document as a flat sequence of text, smart chunking preserves hierarchical structure and one-to-many relationships within documents. It uses a combination of semantic and keyword search to decompose text into meaningful, context-aware chunks, applying dynamic parameters to maintain logical relationships throughout. Sequential identifiers and metadata-driven grouping organize these chunks into field groups, detecting overlaps and removing duplications while keeping the document’s natural flow intact.
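The first step of this stage, a Lambda function reacting to an upload and handing the file to Amazon Textract, could look roughly like the sketch below. It assumes the function is triggered by an S3 event notification and uses Textract's asynchronous text detection; the post does not say which Textract operation Doczy.ai™ calls, and the helper names are hypothetical. The proprietary smart chunking step is not reproduced here.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def start_text_detection(textract_client, bucket: str, key: str) -> str:
    """Start an asynchronous Textract job and return its JobId."""
    response = textract_client.start_document_text_detection(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return response["JobId"]


def handler(event, context, textract_client=None):
    """Lambda entry point. The optional client argument exists only so the
    function can be tested without AWS access."""
    if textract_client is None:
        import boto3  # bundled with the AWS Lambda Python runtimes
        textract_client = boto3.client("textract")
    return [start_text_detection(textract_client, bucket, key)
            for bucket, key in extract_s3_objects(event)]
```

Because the job is asynchronous, a later step would still need to collect the results (for example with `get_document_text_detection`) before any chunking can begin.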

After chunking, the document enters the dual clustering engine of Doczy.ai™. This two-lens methodology analyzes every contract simultaneously from both a semantic and a structural perspective. On the semantic side, extracted text is converted into embeddings, numerical representations of meaning, and similar ideas are grouped together even when they’re expressed in different words. On the structural side, pattern-recognition algorithms identify clause types, formatting conventions, table layouts, and hierarchical organization, understanding, for example, that an exhibit nested three levels deep carries fundamentally different implications than a straightforward attached schedule.

These two analyses don’t operate in isolation. Projection algorithms compare the semantic and structural clusters side by side, synthesizing them into a unified, enriched document model that captures both meaning and context. It’s this convergence that drives the 99% accuracy rate of Doczy.ai™. The system doesn’t just read the words; it understands the contract. Advanced large language models (LLMs) then generate structured output grounded in this dual-clustered intelligence.

Before output is finalized, the system determines each document’s file class and generates prompts tailored to the extracted text, cluster classification, and domain context. Through few-shot and multi-shot prompting, the platform continuously refines the prompt using domain-specific examples and real outputs, creating a feedback loop that compounds accuracy improvements over time.
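To make the two-lens idea more concrete, here is a deliberately simplified toy sketch. It is not the Doczy.ai™ engine: the greedy cosine-similarity grouping, the regex-based structural rules, and the way the two labels are combined into one key are all stand-ins chosen for illustration, and the embeddings are assumed to come from some external model.

```python
import math
import re
from collections import defaultdict


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_clusters(embeddings: list[list[float]],
                      threshold: float = 0.8) -> list[int]:
    """Greedy grouping: a vector joins the first cluster whose seed vector is
    at least `threshold` similar, otherwise it seeds a new cluster."""
    seeds: list[list[float]] = []
    labels: list[int] = []
    for vec in embeddings:
        for idx, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(idx)
                break
        else:
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels


def structural_label(chunk: str) -> str:
    """Classify a chunk from layout cues only (hypothetical, crude rules)."""
    if "|" in chunk or "\t" in chunk:
        return "table"
    match = re.match(r"\s*(\d+(?:\.\d+)*)[.)]?\s", chunk)
    if match:
        # '2.1.3' has two dots, so it sits at nesting depth 3.
        return f"clause-depth-{match.group(1).count('.') + 1}"
    return "paragraph"


def dual_cluster(chunks: list[str], embeddings: list[list[float]]) -> dict:
    """Group chunks by the pair (semantic cluster, structural label)."""
    groups = defaultdict(list)
    for chunk, sem in zip(chunks, semantic_clusters(embeddings)):
        groups[(sem, structural_label(chunk))].append(chunk)
    return dict(groups)
```

Two chunks that say the same thing in different words share a semantic label, yet remain distinguishable when one is a nested clause and the other is free text. That pairing is the point of looking through both lenses at once.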

The resulting structured data flows into Snowflake, forming a centralized repository that powers intelligent dashboards with actionable insights and visualizations. Throughout the entire pipeline, Amazon CloudWatch monitors performance in real time and proactively surfaces issues before they escalate, while AWS Secrets Manager safeguards sensitive information, ensuring that security is not an afterthought, but a foundational layer woven into every stage of the system.
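The post does not detail which secrets AWS Secrets Manager holds. One plausible use, assumed here, is keeping the Snowflake connection credentials out of code and configuration. The sketch below reads a JSON secret through the standard `get_secret_value` call; the secret name and the required field names are hypothetical.

```python
import json


def load_warehouse_credentials(secrets_client, secret_id: str) -> dict:
    """Fetch a JSON secret and check it has the fields a warehouse
    connection would need. The field names are assumptions.

    secrets_client is expected to behave like
    boto3.client("secretsmanager").
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    missing = {"account", "user", "password"} - secret.keys()
    if missing:
        raise KeyError(f"secret is missing fields: {sorted(missing)}")
    return secret
```

Validating the secret's shape up front means a misconfigured secret fails at load time with a clear message rather than later, inside the database driver.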

The transformative impact of Doczy.ai™

The results of this AI-powered approach are transformative and measurable. By automating contract interpretation and document processing, Doczy.ai™ has demonstrated significant impact at scale for multiple organizations across healthcare and financial services. The scale of operations over the last 22 months demonstrates the maturity and production readiness of Doczy.ai™. The solution has processed 2.5 million contract documents (50 million pages) with 137 million API calls to Amazon Bedrock and 442 billion tokens, a level of automation and accuracy previously unattainable through manual or traditional document processing approaches. Over this same period, Doczy.ai™ has helped clients achieve approximately 330 million dollars in cumulative direct and indirect savings.

The 99% accuracy rate represents a significant improvement over the approximately 55% accuracy of rules-based systems and far exceeds manual processing, which is typically affected by fatigue and human error. The 97% reduction in manual processing time translates directly to cost savings and enables organizations to reallocate human resources to higher-value activities that require judgment and strategic thinking.
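The totals above imply some useful per-unit averages. The short calculation below derives them from the reported figures only; the results are simple means that say nothing about how the workload was actually distributed across documents or over time.

```python
# Totals reported for the 22-month period.
MONTHS = 22
DOCUMENTS = 2_500_000
PAGES = 50_000_000
API_CALLS = 137_000_000
TOKENS = 442_000_000_000


def averages() -> dict:
    """Per-unit means derived from the reported totals."""
    return {
        "pages_per_document": PAGES / DOCUMENTS,
        "api_calls_per_document": API_CALLS / DOCUMENTS,
        "tokens_per_api_call": TOKENS / API_CALLS,
        "tokens_per_page": TOKENS / PAGES,
        "documents_per_month": DOCUMENTS / MONTHS,
    }


if __name__ == "__main__":
    for name, value in averages().items():
        print(f"{name}: {value:,.1f}")
```

On average this works out to 20 pages and about 55 Amazon Bedrock calls per document, roughly 3,200 tokens per call, 8,840 tokens per page, and close to 114,000 documents per month.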

A use case in action: Business process automation for health plans

For health plans, Doczy.ai™ provides a powerful solution to automate and improve contract management across the entire lifecycle. It ingests existing contracts in both paper and digital formats, integrates with contract management systems such as Coupa and Icertis, and processes new contracts and amendments as they’re executed. It then creates a centralized metadata repository that feeds directly into downstream systems, enabling end-to-end business process automation.

This automation unlocks critical capabilities. Organizations can continuously analyze and improve contract terms, identifying opportunities to improve financial performance and operational efficiency. The architecture feeds accurate, up-to-date contract data directly into claims systems, automating the configuration process that previously required manual translation of reimbursement terms and removing manual data entry, configuration errors, and delays. Additionally, the platform helps maintain claim payment accuracy by assessing payments against contract terms, identifying discrepancies, and flagging potential overpayments or underpayments before they occur.

By automating manual processes, health plans can adapt quickly to new contract terms and regulatory requirements. The intelligent dashboards and actionable insights provided by Doczy.ai™ enable decision-makers to understand contract performance, identify trends, and take proactive action to optimize financial outcomes.
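The payment accuracy check described above can be illustrated with a small sketch. The data model is hypothetical: it assumes each extracted contract term reduces to a flat rate per unit for a service code, which is far simpler than real reimbursement terms, and the service codes and tolerance are invented for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ContractTerm:
    service_code: str
    rate_per_unit: Decimal  # contracted reimbursement per unit


@dataclass(frozen=True)
class Claim:
    claim_id: str
    service_code: str
    units: int
    amount: Decimal  # amount proposed for payment


def check_claim(claim: Claim, terms: dict[str, ContractTerm],
                tolerance: Decimal = Decimal("0.01")) -> str:
    """Compare a proposed payment with the contracted amount before paying."""
    term = terms.get(claim.service_code)
    if term is None:
        return "no-contract-term"
    expected = term.rate_per_unit * claim.units
    difference = claim.amount - expected
    if difference > tolerance:
        return "overpayment"
    if difference < -tolerance:
        return "underpayment"
    return "ok"


# Usage with made-up values: two units at 80.00 should cost 160.00.
terms = {"SVC-001": ContractTerm("SVC-001", Decimal("80.00"))}
status = check_claim(Claim("c-1", "SVC-001", 2, Decimal("170.00")), terms)
```

`Decimal` is used instead of floating point so that currency comparisons are exact. Running the check on the proposed amount, rather than on a payment already made, is what allows a discrepancy to be flagged before it occurs.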

Getting started with Doczy.ai™

Organizations interested in using Doczy.ai™ to transform document processing and contract management can engage with AArete to discuss their specific use cases and requirements. AArete offers the platform as a Software as a Service (SaaS) solution, enabling rapid deployment without significant infrastructure investment. AArete’s team of experts will configure this solution for your specific document types, domain terminology, and business processes, supporting maximum value from day one.

Conclusion

The challenge of unlocking data from unstructured documents is a major hurdle for many businesses, particularly in healthcare and financial services where contracts and agreements govern critical operational and financial relationships. By embracing intelligent document processing on AWS, organizations can solve this long-standing operational challenge and unlock a new frontier of strategic advantage, turning their data into their most valuable asset.

Built on a sophisticated architecture that orchestrates Amazon Cognito, Amazon S3, AWS Lambda, Amazon Textract, Amazon Elastic Container Service (Amazon ECS), Amazon Bedrock, Amazon CloudWatch, and AWS Secrets Manager, Doczy.ai™ demonstrates how modern cloud services can solve complex document-heavy business problems. Its advanced hybrid smart chunking, dual clustering, and prompt optimization techniques form the core of a patented contract intelligence engine.

Doczy.ai™ delivers tangible impact, processing up to 250,000 contract documents per week with 99% accuracy, reducing manual processing time by 97%, and helping clients unlock roughly 330 million dollars in cumulative savings over 22 months. By embracing this intelligent document processing, organizations can turn contracts into a strategic data asset, improving efficiency, accuracy, and profitability while freeing teams to focus on higher-value work.

To learn more about how AArete and Doczy.ai™ can help your organization transform document processing and unlock the value of your unstructured data, visit the AArete website.


About the authors