Analysis Of 12,000 Documents And 2,000 Emails For A Law Firm

Client Country: Czech Republic

  • Client type: SME
  • Industry: Legal & Justice Systems
  • Application areas: Legal, Compliance & Risk
  • AI technologies: Generative AI, Natural Language Processing (NLP)
  • Business impacts: Employee Enablement & Productivity, Operational Efficiency & Cost Savings
  • Data types: Documents / Semi-structured Data, Image Data, Textual Data
  • Delivery models: Product / Licensed Software
  • Deployments: No Deployment
  • Key capabilities: Conversational & Language Interaction
  • Project stages: Proof of Concept (prototype / PoC / pilot)
  • Solution forms: Analysis, Recommendation, or Report

Solution Description

Problem description

The assignment was clear: search for dozens of word combinations in thousands of documents and emails, all on a limited budget. At first, we got swept up in the hype and tried ChatGPT. After all, what else would come to mind when AI buzzwords are popping out of even your fridge? But we soon realized that this was like catching sardines with a shark net, so we chose a more conservative and, above all, more effective approach.

Solution

We processed over 12,000 documents and 2,000 email threads to find keywords and reduce the number of documents requiring manual review. The text was loaded into a database and queried with SQL (BigQuery). Scanned PDFs first had to be converted to text with OCR, and large documents were split into smaller parts. Each search phrase had several variants because of synonyms, which led to dozens of combinations. We implemented two levels of search, general and detailed. The general level helped identify misleading trails, while the detailed level refined the results where too many matches appeared.
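The steps described here (splitting large documents, expanding synonyms into combinations, and searching at two levels) can be illustrated with a small sketch. The project itself ran its searches as SQL in BigQuery; the Python below is only an in-memory illustration, not the team's implementation. All function names, the chunk sizes, and the interpretation of "general" (all terms occur somewhere in the text) versus "detailed" (terms occur in order, close together) are assumptions made for this example.

```python
import re
from itertools import product


def split_into_chunks(text, max_words=200, overlap=20):
    """Split text into word-based chunks that overlap slightly.

    The overlap (an assumption, not stated in the source) keeps a phrase
    that straddles a chunk boundary findable in at least one chunk.
    """
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + max_words]) for i in range(0, stop, step)]


def expand_variants(synonym_groups):
    """Return every combination that takes one term from each synonym group."""
    return list(product(*synonym_groups))


def general_match(text, combo):
    """General level (assumed meaning): every term occurs somewhere in the text.

    This is a plain substring test, so "contract" also matches "contractor".
    """
    lowered = text.lower()
    return all(term.lower() in lowered for term in combo)


def detailed_match(text, combo, max_gap=5):
    """Detailed level (assumed meaning): terms occur in the given order,
    with at most max_gap other words between consecutive terms."""
    gap = r"(?:\W+\w+){0,%d}\W+" % max_gap
    pattern = r"\b" + gap.join(re.escape(term) for term in combo) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```

For instance, `expand_variants([["contract", "agreement"], ["termination", "cancellation"]])` yields four combinations. Each one can be checked with `general_match` first, and `detailed_match` applied only to combinations that return too many hits.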

Main Users of the Solution

Junior lawyers, Attorneys

Technologies used

Document AI, Google Cloud Databases, Vertex AI, Google Workspace, ML & ML APIs, Productivity

Use of Personal / Regulated Data

Yes

Implementation

Project Owner on the Client's Side

C-level executives

Form of Supplier Involvement

Full implementation

Impact and Results

Qualitative Benefits

The final output was a simple table listing, for each variant of a word combination, the number of matches and the specific documents to review. Since all documents had been converted into plain text, the exact position of a word within a document could then be found using classic full-text search.
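A minimal sketch of such a table, assuming the documents are available as a mapping from a document ID to its plain text. The names `build_report` and `find_positions`, the row layout, and the simple "all terms present" matching rule are illustrative assumptions, not the format actually delivered to the client.

```python
import re


def build_report(documents, variants):
    """Build one row per variant: its match count and the documents to review.

    documents: dict mapping a document ID to its plain text (assumed layout).
    variants:  iterable of term tuples, e.g. ("agreement", "termination").
    """
    rows = []
    for variant in variants:
        hits = sorted(
            doc_id
            for doc_id, text in documents.items()
            if all(term.lower() in text.lower() for term in variant)
        )
        rows.append(
            {
                "variant": " + ".join(variant),
                "matches": len(hits),
                "documents": hits,
            }
        )
    return rows


def find_positions(text, term):
    """Return the character offsets where term occurs (case-insensitive)."""
    return [m.start() for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE)]
```

A reviewer would start from the rows with a manageable number of matches, open the listed documents, and use something like `find_positions` to jump to the relevant passage.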

Quantitative Results

Our solution saved the client hundreds of hours of tedious manual work that would have come with no guarantee of results.

Lessons Learned and Recommendations

Key Success Factors

For projects like this, it is crucial to anticipate what type of output will be most useful for the client. Sometimes what you need is not the trendiest technology but the most reliable solution; in this case, that was a table serving as a clear entry point to the findings.

Recommendation for Others

Do not rely on the first, hype-driven solution; look for the one that is truly most effective.


© cnaip 2026
