By Stephane Boghossian, CEO of HAQQ Legal AI
Published: April 1, 2026 | Reading time: ~25 min
TL;DR: So I stumbled onto a legal ontology built on Dynamic Interfaces' platform and it kind of broke my brain. A team in Mexico replaced 300 MCP tools with 7 ontology-aware ones. Per-message cost dropped from $0.60 to $0.02. Meanwhile, Harvey AI just raised at $11B and charges $1,200/lawyer/month -- for RAG. Stanford researchers found that production legal RAG tools hallucinate 17-33% of the time. The trick isn't better prompts or cheaper models -- it's modeling law the way law actually works: structured, versioned, cross-referenced. We're now building this for UAE labor law at HAQQ. I'm genuinely obsessed.
A few weeks ago, I got on a call with the CEO of Dynamic Interfaces to look at something called Sentencia -- a legal ontology for Mexican labor law. I figured I'd see a demo, take some notes, move on.
That's not what happened.
I've been building legal AI at HAQQ for the MENA region, and I've sat through enough "revolutionary" demos to last a lifetime. Most of them are just RAG with a nicer UI. Sentencia looked different from the first five minutes. Not because of slick design or marketing speak -- because of what was happening under the hood.
Here's the thing that stopped me cold: 5 Mexican government customers were using this system daily. Court-appointed expert witnesses -- peritos -- were querying labor law across four federal statutes, getting precise answers with full legal citations, and the whole thing cost two cents per message. Not two dollars. Two cents.
Holy shit.
For context, Harvey AI -- the $11B golden child of legal AI -- charges $1,200 per lawyer per month. CoCounsel starts at $220/month. Even the cheapest seat in legal AI runs $100+/month. And here was this system in Mexico doing it for two cents per message. No subscriptions. No seat minimums. Just a structured knowledge graph and 7 well-designed tools.
I spent the next two weeks pulling Sentencia apart to understand why it works. This article is what I found, why it matters for anyone building legal AI, and what we're building at HAQQ because of it.
Before I get into the ontology architecture, I need to say something blunt about the current state of legal AI. Because the more I dug into the competitive landscape, the more a pattern emerged -- and it's not flattering for the incumbents.
Every major legal AI company is a RAG wrapper. Not one has a formal legal ontology.
Let me be specific.
Harvey AI -- $11B valuation, $1.2B raised, backed by Sequoia and GIC -- runs fine-tuned LLMs with RAG over legal databases. They charge ~$1,200/lawyer/month at list price ($100-500 after enterprise discounts, 20-seat minimum, $288K annual floor). They just announced a LexisNexis integration, adding another $400-600/lawyer/year. They claim 91% accuracy on their "BigLaw Bench." That still means 9% of legal work contains errors. In a profession where a single wrong citation gets you sanctioned.
CoCounsel (Thomson Reuters) -- 1 million users, bolted onto Westlaw's 100+ years of case law. Multi-model architecture across Anthropic, OpenAI, and Google. Pricing from $220 to $500/user/month. Better data moat than Harvey. But still RAG at its core -- federated search with AI summarization layered on top.
Legora (formerly Leya) -- $5.55B valuation, 800 law firms. Built on Claude with agentic workflows. $250/user/month, 10-seat minimum. No proprietary legal knowledge structure. It's a very well-designed wrapper.
Now here's the number that should make every legal AI founder lose sleep.
Turns out Stanford ran a preregistered empirical study on this -- the first of its kind. Magesh et al., published in the Journal of Empirical Legal Studies in 2025. They tested production legal RAG tools and found hallucination rates of 17-33% across the board:
| Tool | Hallucination Rate | Source |
|---|---|---|
| GPT-4 (general purpose) | 58% | Magesh et al., JELS 2025 |
| Llama 2 (general purpose) | 88% | Magesh et al., JELS 2025 |
| Westlaw AI-Assisted Research | 33% | Magesh et al., JELS 2025 |
| Lexis+ AI | 17%+ | Magesh et al., JELS 2025 |
| Ask Practical Law AI | 17%+ | Magesh et al., JELS 2025 |
The Stanford team's conclusion: RAG reduces hallucinations versus general-purpose models, but hallucinations remain "substantial, wide-ranging, and potentially insidious." Legal AI providers' claims of "hallucination-free" citations are demonstrably overstated.
So the companies charging $1,200/month are shipping tools that get it wrong up to a third of the time. And none of them have the architectural foundation to fix it -- because the problem isn't the model. It's the retrieval architecture.
Meanwhile, in an entirely different domain -- clinical medicine -- researchers published a paper showing that ontology-grounded GraphRAG hit 98% accuracy versus ChatGPT-4's 37%. That's not a typo. Ontology-grounded hallucination rate: 1.7%. ChatGPT-4 hallucination rate: 63%. A 61-percentage-point improvement, published in the Journal of Biomedical Informatics, using SNOMED CT (the medical ontology standard) as the grounding layer.
The medical domain proved it. The legal domain needs it. And nobody's building it.
That's the gap. That's what HAQQ is walking into.
I want to be upfront about what this was. Not a product review. Not a partnership announcement. This was me connecting to Dynamic Interfaces' MCP server, exploring Sentencia's data structures, analyzing the ontology design, and stress-testing it against everything I know about legal reasoning.
The Model Context Protocol (MCP) was the interface. Every action in the Sentencia ontology is exposed as a callable MCP tool. Any MCP-compatible client can plug in. That alone is interesting, but the real story is what happens when you constrain an AI agent to operate within a well-defined ontology instead of throwing 300 generic tools at it.
The Model Context Protocol (MCP) is an open standard introduced by Anthropic in November 2024 for connecting AI models to external tools and data sources. Now governed by the Linux Foundation's Agentic AI Foundation, MCP defines a JSON-RPC interface through which AI agents can discover and invoke tools. As of early 2026, over 10,000 active public MCP servers are registered, with 97 million monthly SDK downloads. The protocol has been adopted by OpenAI, Google DeepMind, and major enterprise platforms.
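Concretely, tool discovery in MCP is a JSON-RPC exchange. Here's a minimal Python sketch of what a `tools/list` request and response might look like for an ontology-backed server; the tool shown and its schema are illustrative, not Sentencia's actual definitions:

```python
import json

# JSON-RPC 2.0 request an MCP client sends to discover a server's tools.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# A response from a hypothetical ontology-backed server: each tool carries
# a name, a description, and a JSON Schema for its arguments, all of which
# get serialized into the model's context window on every request.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "query_legal_knowledge",
                "description": "Structured query across the legal ontology.",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "law": {"type": "string"},
                        "article": {"type": "integer"},
                    },
                    "required": ["law"],
                },
            }
        ]
    },
}

print(json.dumps(response["result"]["tools"][0]["name"]))
```

Every entry in that `tools` array costs context tokens on each request, which is why tool count drives cost later in this article.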
Their CEO put it plainly: "Ontologies are kind of the secret."
Coming from a guy whose entire platform is built around generating infrastructure from ontology definitions, sure -- that sounds self-serving. But after digging into the data, I think he's right. And I think this is the same insight that made Palantir a $100B+ company.
Turns out there's academic backing for this too. A 2025 paper on tool selection found that reducing tool count tripled accuracy -- from 13.6% to 43.1% -- while cutting prompt tokens by over 50% (RAG-MCP, arXiv:2505.03275). Fewer tools, dramatically better performance. That's exactly what the ontology does: collapses hundreds of granular database operations into a handful of semantically meaningful legal operations.
I'll be blunt. Most legal AI products -- including most of what exists in the MENA market -- are doing RAG over PDFs. They chunk legal documents, embed them in a vector database, and retrieve semantically similar passages when you ask a question. This works for general knowledge queries. It fails catastrophically for law.
RAG-based legal AI fails in three specific ways: temporal blindness (retrieving wrong versions of amended law), structural ignorance (losing legislative hierarchy during chunking), and cross-reference amnesia (inability to follow typed relationships between provisions).
The consequences aren't theoretical. At least 6 attorneys have been sanctioned for filing AI-generated fake case citations since 2023 -- starting with the now-infamous Mata v. Avianca case where ChatGPT fabricated cases and then confirmed they "indeed exist" in Westlaw. In 2025, a large, well-regarded law firm got hit in Johnson v. Dunn. This is an ongoing crisis, not a one-off.
I keep running into the same three failure modes.
RAG systems can't inherently distinguish between the 2022 version and the 2023 version of a legal article. A semantic search for "vacation days Mexico" might pull back the pre-2023 text (6 days minimum) instead of the post-2023 text (12 days minimum). In casual conversation, being off by 2x is embarrassing. In a legal calculation submitted to a court, it invalidates the entire document.
Mexico's 2023 "Vacaciones Dignas" reform doubled vacation entitlements overnight. A perito calculating a wrongful dismissal case spanning that boundary needs both versions -- the old table for pre-2023 years, the new table for post-2023. RAG has no mechanism for this. None.
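To make the versioning concrete, here's a sketch of the lookup flat retrieval can't do: the entitlement depends on which version of Art. 76 LFT was in force on the date in question. The tables follow Art. 76 before and after the reform; the code structure is my own illustration, limited to the first five years of service:

```python
from datetime import date

# Art. 76 LFT vacation tables, keyed by the version of the article in force.
# Days per completed year of service; the pre-reform year-5 value follows
# the prevailing interpretation of the "every 5 years" clause.
TABLES = {
    "pre_2023":  {1: 6,  2: 8,  3: 10, 4: 12, 5: 14},
    "post_2023": {1: 12, 2: 14, 3: 16, 4: 18, 5: 20},
}
REFORM_DATE = date(2023, 1, 1)  # "Vacaciones Dignas" entry into force

def vacation_days(years_of_service: int, as_of: date) -> int:
    """Statutory vacation entitlement under the version of Art. 76 LFT
    in force on `as_of`. Sketch covers years 1-5 only."""
    if not 1 <= years_of_service <= 5:
        raise NotImplementedError("5-year blocks beyond year 5 omitted here")
    table = TABLES["post_2023"] if as_of >= REFORM_DATE else TABLES["pre_2023"]
    return table[years_of_service]

print(vacation_days(1, date(2022, 6, 1)))  # 6  -- pre-reform minimum
print(vacation_days(1, date(2023, 6, 1)))  # 12 -- post-reform minimum
print(vacation_days(5, date(2026, 4, 1)))  # 20 -- the value used later in this article
```

A perito computing a claim that spans the boundary calls this twice, once per regime. A vector search has no equivalent operation.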
An EMNLP 2025 survey of RAG-reasoning systems confirmed this as a systemic issue: "temporal blindness" is one of the identified failure modes, alongside "mode-switch fragility" where models actually perform worse with full retrieval sets than without any documents at all.
Law is hierarchically structured: Constitution, Federal Law, Regulations, Circulars. Within a law: Titles, Chapters, Articles, Fractions, Paragraphs. RAG chunking destroys this hierarchy. When Article 50 says "in the terms of Article 48," a RAG system may retrieve Article 50 without Article 48, producing an incomplete answer. Worse -- it doesn't know what it's missing.
Legal reasoning is inherently graph-based. A seniority premium calculation under Article 162 of Mexico's Federal Labor Law references UMA values defined in the Constitution. INFONAVIT housing credits are calculated in UMAs per a 2016 reform. The retirement savings law depends on the social security law's contribution definitions. A RAG system has no mechanism to follow these reference chains. It retrieves fragments. An ontology traverses connections.
A legal knowledge graph is a network of legal provisions connected by typed, semantic relationships. Unlike a document index where connections are inferred by keyword similarity, a legal knowledge graph explicitly encodes that Article 50 of Mexico's Federal Labor Law "establishes a formula" referenced by Article 48, which in turn "refers to" Article 84's salary definition. Each edge carries a relationship type -- refers_to, complements, creates_exception, establishes_formula, establishes_procedure, modifies, defines_term -- enabling an AI agent to traverse legal logic deterministically.
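A toy version of that traversal in Python. The node IDs and edges mirror the Article 50 → 48 → 84 chain described above; everything else is illustrative:

```python
# A toy typed cross-reference graph: edges carry the semantic relationship
# named in the text (remite_a, establece_formula, ...). Nodes and edges
# follow the LFT example above; the data structure is my own sketch.
EDGES = {
    "LFT-50": [("establece_formula", "LFT-48")],
    "LFT-48": [("remite_a", "LFT-84")],
    "LFT-84": [],
}

def traverse(start: str, max_depth: int = 5) -> list[tuple[str, str, str]]:
    """Deterministically follow typed edges from `start`, returning
    (source, relation, target) triples: graph traversal, not retrieval."""
    out, frontier = [], [start]
    for _ in range(max_depth):
        nxt = []
        for node in frontier:
            for rel, target in EDGES.get(node, []):
                out.append((node, rel, target))
                nxt.append(target)
        frontier = nxt
        if not frontier:
            break
    return out

for src, rel, dst in traverse("LFT-50"):
    print(f"{src} --{rel}--> {dst}")
```

The point of the typed edge is that the agent knows *why* it is fetching Article 84, not just that it is textually similar.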
A 2025 paper from arXiv -- "An Ontology-Driven Graph RAG for Legal Norms" (SAT-Graph RAG, arXiv:2505.00039) -- validated exactly this. The researchers found that standard flat-text retrieval is "blind to the hierarchical, diachronic, and causal structure of law, leading to anachronistic and unreliable answers." Their solution -- grounding a knowledge graph in a formal legal ontology with temporal versioning and causal event nodes -- mirrors what Sentencia built. Applied to the Brazilian Constitution, it demonstrated verifiable, temporally-correct answers with drastically reduced factual errors.
I read that paper after my Sentencia deep-dive and got chills. Independent validation from researchers who'd never seen the system.
HTML_BLOCK_0
Here's what Sentencia actually looks like under the hood.
A legal ontology is a structured, machine-readable knowledge model that represents a legal domain as a graph of entities (laws, articles, courts, computed values), their hierarchical relationships, and typed cross-references between provisions. Unlike flat-text retrieval systems, a legal ontology preserves the inherent structure of legislation -- hierarchy, versioning, exceptions, and formula chains -- enabling deterministic legal reasoning rather than probabilistic document retrieval.
Scale: 11 entity types, 1,689 articles, 4 federal laws, 7 judicial precedents, 7 typed cross-reference categories, 10 enum taxonomies.
The Sentencia legal ontology models 1,689 articles across 4 Mexican federal labor laws with 11 entity types, 7 typed cross-reference categories, and 10 enum taxonomies.
The four laws form a closed system around the Mexican employer-worker relationship:
| Law | Abbreviation | Articles | What It Governs |
|---|---|---|---|
| Ley Federal del Trabajo | LFT | 1,076 | Employment, rights, disputes, procedures |
| Ley del Seguro Social | LSS | 373 | Social security: health, disability, pensions |
| Ley de los Sistemas de Ahorro para el Retiro | LSAR | 146 | Retirement savings, AFORE accounts |
| INFONAVIT Law | INFONAVIT | 94 | National housing fund, mortgage credits |
The hierarchy flows naturally from the law itself. Click any law below to explore how it breaks down:
HTML_BLOCK_1
But hierarchy alone isn't the interesting part. What makes Sentencia architecturally powerful is the typed cross-reference system. In most legal AI systems, a reference between two articles is just a hyperlink -- "see also Article 48." In Sentencia, it's a typed relationship with semantic meaning:
| Reference Type | Meaning | Why It Matters |
|---|---|---|
| `remite_a` | Refers to | Basic dependency chain |
| `complementa` | Complements | Expands scope across laws |
| `modifica` | Modifies | Tracks reform history |
| `excepciona` | Creates exception | Prevents over-application of general rules |
| `define_termino` | Defines term | Links usage to legal definition |
| `establece_formula` | Establishes formula | Points to calculation methods |
| `establece_procedimiento` | Establishes procedure | Maps how to exercise a right |
That `excepciona` type -- that's the one that keeps me up at night. Article 5 of the LFT establishes general labor rights, but special work regimes -- domestic workers, athletes, digital platform workers -- create exceptions. If your system doesn't model exceptions explicitly, it applies general rules where special rules should apply. That's not a rounding error. That's a legally catastrophic mistake that could cost someone their case.
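Modeling the exception explicitly could look like this sketch: before applying a general provision, check whether any article holds an `excepciona` edge toward it for the worker's regime. The regime names and special-regime article IDs below are my illustrations, not verified citations:

```python
# Illustrative exception resolution. The mapping says: for this regime,
# this special article displaces the general one. IDs are hypothetical.
EXCEPTIONS = {
    # (general_article, worker_regime) -> special article that displaces it
    ("LFT-5", "plataformas_digitales"): "LFT-291A",
    ("LFT-5", "trabajo_del_hogar"): "LFT-331",
}

def applicable_article(general: str, regime: str) -> str:
    """Return the special provision if an excepciona edge exists for this
    regime, otherwise fall back to the general rule."""
    return EXCEPTIONS.get((general, regime), general)

print(applicable_article("LFT-5", "plataformas_digitales"))  # LFT-291A
print(applicable_article("LFT-5", "industria_general"))      # LFT-5
```

A RAG system has no equivalent of that lookup; it would retrieve Article 5 and apply it everywhere.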
The key insight -- and one that an independent 2025 arXiv paper confirmed -- is that law already has an ontology. It's already structured into hierarchies, cross-referenced with typed relationships, and versioned through reforms. Sentencia doesn't impose structure on unstructured text. It captures the structure that already exists in the law itself.
That was my "oh" moment. We've been trying to make AI understand law by throwing text at it. But law isn't text. Law is a graph.
This is where it goes from "architecturally interesting" to "holy shit, this changes the business model."
A legal ontology reduced AI reasoning costs from $0.60 to $0.02 per message -- a 97% reduction -- by collapsing 300 MCP tool descriptions into 7 ontology-aware tools.
Here's the before/after in a clean comparison:
| Metric | Before Ontology (RAG) | After Ontology | Improvement |
|---|---|---|---|
| Tools per request | ~300 | 7 | 97.7% fewer |
| Tokens for tool descriptions | ~90,000 | ~2,100 | 97.7% fewer |
| Cost per message | $0.60 | $0.02 | 96.7% cheaper |
| Model requirement | Frontier (GPT-4 class) | Open-source | ~30x cheaper |
| Answer traceability | Probabilistic chunks | Deterministic graph traversal | Full citation chain |
To put this in competitive context:
| Product | Cost per Interaction | Annual Cost (200 queries/mo) |
|---|---|---|
| Harvey AI | $2.40-$6.00 | $14,400+ |
| CoCounsel All Access | $1.00-$2.50 | $6,000 |
| Legora | $1.00-$1.25 | $3,000 |
| Sentencia (ontology) | $0.02 | $48 |
That's not a different price tier. That's a different universe.
HTML_BLOCK_2
The standard approach to building AI agents is to expose every database operation as a separate tool. Get article by number. Get article by topic. Get articles by law. Get article text. Get article formula. Search articles by keyword. Get article reforms. Get article at date. Multiply that across 11 entity types and you hit 300 tools fast.
Here's the problem most people miss: every one of those tool descriptions gets serialized into the LLM's context window on every single request. A single large MCP server can consume 10,000-17,000+ tokens of context just for tool descriptions. With 300 tools at roughly 300 tokens each, you're burning 90,000 tokens before the user even asks a question.
At Claude Opus pricing ($15 per million input tokens, $75 per million output tokens as of March 2026), that's roughly $1.35 per request just for tool descriptions. Add the actual conversation and you land around $0.60 per message.
Turns out this isn't just a cost problem -- it's an accuracy problem too. A 2025 paper (RAG-MCP, arXiv:2505.03275) found that baseline tool selection accuracy with all tools in context was a dismal 13.62%. By using RAG to retrieve only relevant tools, accuracy jumped to 43.13% -- a 3.17x improvement. Prompt tokens dropped by over 50%. The platforms figured this out empirically: OpenAI hard-caps at 128 tools per agent, and Cursor limits MCP tools to 40, explicitly to prevent "flooding the agent's context window."
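The arithmetic behind those numbers is worth making explicit. A quick sketch using the figures from the text: $15 per million input tokens, roughly 300 tokens per tool description:

```python
# Reproducing the article's context-window arithmetic.
PRICE_PER_M_INPUT = 15.00      # USD per million input tokens (per the text)
TOKENS_PER_TOOL_DESC = 300     # rough per-tool description size (per the text)

def tool_overhead_cost(n_tools: int) -> float:
    """USD spent per request just serializing tool descriptions."""
    tokens = n_tools * TOKENS_PER_TOOL_DESC
    return tokens * PRICE_PER_M_INPUT / 1_000_000

print(round(tool_overhead_cost(300), 4))  # 1.35   -- before the user asks anything
print(round(tool_overhead_cost(7), 4))    # 0.0315 -- the 7-tool ontology interface
```

That $1.35 of pure overhead, paid on every message, is where the $0.60 figure comes from once the conversation itself is added.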
The ontology collapses this. Instead of 300 granular database operations, you get 7 semantically meaningful legal operations:
- `query_legal_knowledge` -- structured query across the entire ontology
- `calculate_settlement` -- applies formulas from articles + wage tables
- `find_jurisprudencia` -- retrieves relevant case law with article links
- `get_cross_references` -- traverses the `ReferenciaLegal` graph
- `get_wage_data` -- `SalarioMinimo` + UMA for any year/zone
- `get_vacation_table` -- correct regime for any date
- `generate_dictamen` -- template-based document generation
Seven tools. 2,100 tokens for descriptions. $0.02 per message.
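As a sanity check on that 2,100-token figure: seven descriptions at roughly 300 tokens each. The per-tool size is my assumption, consistent with the earlier math:

```python
# The seven ontology-aware tools named in the text, with rough description
# sizes. Token counts are illustrative; the article reports ~2,100 total.
TOOLS = {
    "query_legal_knowledge": 300,
    "calculate_settlement": 300,
    "find_jurisprudencia": 300,
    "get_cross_references": 300,
    "get_wage_data": 300,
    "get_vacation_table": 300,
    "generate_dictamen": 300,
}

print(len(TOOLS), sum(TOOLS.values()))  # 7 2100
```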
I stared at those numbers for a long time. This isn't incremental improvement. This is a different cost universe.
Ontology-based legal AI enables a shift from frontier models costing $15 per million input tokens to open-source models at $0.50 per million, because the simplified 7-tool interface requires less reasoning capability.
When an AI agent only has 7 well-defined tools to choose from -- instead of 300 -- the model doesn't need to be as smart. The ontology does the heavy structural lifting. The model just needs to understand the user's question, pick the right 1-3 tools, and synthesize a response from structured data. That's a much simpler task than reasoning about which of 300 tools to chain together.
Anthropic's own engineering team demonstrated the principle: swapping direct MCP calls for a code-execution approach collapsed a 150,000-token workflow to ~2,000 tokens -- a 98% reduction. The ontology achieves the same compression through domain modeling rather than code generation.
HTML_BLOCK_3
The cost math hits different when you look at who actually uses this system. A perito laboral in Mexico City earns roughly 15,000-30,000 MXN per month -- about $830 to $1,670 USD. At $0.60 per message, with 50 cases per month and 20 messages per case, the API cost alone would be $600/month. That's 36-72% of their income.
Completely non-viable. Dead on arrival.
At $0.02 per message, the same usage costs $20/month. That's 1.2-2.4% of income -- less than a Netflix subscription. The cost reduction doesn't just make the product cheaper. It makes an entirely new market possible. These are people who literally couldn't afford legal AI before.
And there's a compounding effect here that I find genuinely exciting: lower cost per message means more messages per case. More messages means more thorough analysis. More thorough analysis means better legal documents. Better documents mean more wins, more reputation, more referrals. The cost reduction doesn't just affect price -- it affects quality. It changes the product itself.
Compare that to Harvey's model: $1,200/lawyer/month with a 20-seat minimum means $288,000/year before a single query runs. That's designed for Am Law 100 firms with 1,000+ lawyers. It structurally excludes the solo practitioners, small firms, and expert witnesses who handle the bulk of labor law globally.
Let me trace through a real example -- not a theoretical one. This is a complete wrongful dismissal calculation using actual LFT articles, real 2026 wage data, and the full cross-law dependency chain. This is what convinced me the architecture produces better answers than RAG -- not theoretically, but in practice.
The case: Maria Elena Gutierrez, administrative assistant at a manufacturing company in Guadalajara. Monthly salary: $15,000 MXN ($500/day). Employed 5 years, 3 months (January 2021 to April 2026). Terminated without just cause -- employer claims "restructuring" but provides no written notice per Art. 47 LFT.
Here's how the ontology handles it -- and what RAG would miss.
Step 1: Classify the dispute. The system identifies despido injustificado (wrongful dismissal) from the query. This isn't semantic similarity -- it's an exact match against the `ArticuloTema` enum. No fuzzy retrieval. No "close enough." Critically, Art. 47 LFT requires the employer to deliver written notice stating specific conduct and dates. Failure to provide written notice makes the dismissal unjustified automatically. The ontology knows this because the `excepciona` relationship is modeled.
Step 2: Check binding precedent. The system retrieves Tesis 2a./J. 53/2017 -- binding jurisprudencia from the SCJN's Second Chamber. This tesis establishes that when an employer denies dismissal and offers reinstatement, the tribunal must evaluate whether the offer is in good or bad faith. If bad faith (e.g., offering reinstatement under worse conditions), the burden of proof stays on the employer. Per Art. 784 LFT, the employer bears the burden of proof for working conditions, attendance, seniority, wages, and more. RAG might retrieve this tesis -- or it might not. The ontology always does, because it's linked to `tema: despido` with a typed `interprets` relationship.
Step 3: Follow the formula chain and compute everything. Here's the full calculation -- every line traceable to a specific article:
| Concept | Legal Basis | Calculation | Amount (MXN) |
|---|---|---|---|
| Indemnizacion constitucional | Art. 48 LFT | 90 days x $500/day | $45,000.00 |
| 20 dias por ano de servicio | Art. 50 LFT | 20 x 5.25 years x $500 | $52,500.00 |
| Prima de antiguedad | Art. 162 LFT | 12 x 5.25 years x $500 | $31,500.00 |
| Subtotal | | | $129,000.00 |
| Concept | Legal Basis | Calculation | Amount (MXN) |
|---|---|---|---|
| Proportional aguinaldo | Art. 87 LFT | 15 days x (3/12) x $500 | $1,875.00 |
| Proportional vacation | Art. 76 LFT (post-2023) | 20 days x (3/12) x $500 | $2,500.00 |
| Prima vacacional | Art. 80 LFT | 25% of $2,500 | $625.00 |
| Subtotal | | | $5,000.00 |
| Concept | Legal Basis | Calculation | Amount (MXN) |
|---|---|---|---|
| Salarios vencidos (8 months) | Art. 48 LFT | 8 months x $15,000 | $120,000.00 |
| (Capped at 12 months max) | | | |
| Check | Legal Basis | What It Covers |
|---|---|---|
| IMSS contribution audit | Arts. 27-28 LSS | Was salary correctly registered? SBC cap = 25 UMAs = $2,932.75/day |
| INFONAVIT contributions | Art. 29 Ley INFONAVIT | 5% employer contribution to housing subaccount |
| AFORE impact | LSAR | Unemployment withdrawal rights after 46 days |
| Component | Amount (MXN) |
|---|---|
| Indemnizacion | $129,000.00 |
| Finiquito | $5,000.00 |
| Salarios vencidos (8 months) | $120,000.00 |
| GRAND TOTAL | $254,000.00 |
That's $254,000 MXN -- roughly $14,100 USD -- computed deterministically from 8 articles across 4 laws. Every number traces to a specific article, in a specific version, via a named cross-reference.
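For the skeptical, here is the whole calculation re-derived in a few lines of Python. The formulas follow the cited articles; the function layout is my sketch, not Sentencia's implementation. Note that the Art. 162 base is capped at twice the minimum wage (per Art. 486 LFT), which doesn't bind here because $500/day is below the $630.08 cap:

```python
# Re-deriving the Maria Elena example deterministically.
DAILY = 500.0           # $15,000 MXN monthly / 30 days
YEARS = 5.25            # Jan 2021 -> Apr 2026
MIN_WAGE_2026 = 315.04  # general-zone salario minimo, per the table below

def settlement() -> float:
    indemnizacion = 90 * DAILY                  # Art. 48 LFT
    veinte_dias = 20 * YEARS * DAILY            # Art. 50 LFT
    # Art. 162 caps the prima base at twice the minimum wage (Art. 486 LFT)
    prima_base = min(DAILY, 2 * MIN_WAGE_2026)
    prima_antiguedad = 12 * YEARS * prima_base  # Art. 162 LFT
    aguinaldo = 15 * (3 / 12) * DAILY           # Art. 87 LFT, proportional
    vacaciones = 20 * (3 / 12) * DAILY          # Art. 76 LFT (post-2023)
    prima_vacacional = 0.25 * vacaciones        # Art. 80 LFT
    salarios_vencidos = 8 * 30 * DAILY          # Art. 48 LFT, 12-month cap
    return sum([indemnizacion, veinte_dias, prima_antiguedad,
                aguinaldo, vacaciones, prima_vacacional, salarios_vencidos])

print(f"{settlement():,.2f}")  # 254,000.00
```

Every line is arithmetic over structured inputs. There is nothing for the model to hallucinate.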
A RAG system asked this question would retrieve some of these articles, miss others, and hallucinate the amounts it couldn't find. Turns out Stanford quantified exactly how often: 17-33% of the time. On a $254,000 calculation, a 17% error rate means the answer could be off by $43,000. That's not an approximation. That's malpractice.
The ontology computes this deterministically. Every time. Zero hallucination on the numbers -- because the numbers come from structured data, not generated text.
HTML_BLOCK_4
This deserves its own section because it's the single most common error in Mexican labor calculations -- and a perfect illustration of why structured knowledge representation matters.
Since Mexico's 2016 constitutional reform, two reference units coexist: the UMA (Unidad de Medida y Actualizacion) and the Salario Minimo (minimum wage). They sound similar. They are not. And the gap between them has been widening every year:
| Year | Salario Minimo (daily) | UMA (daily) | Gap |
|---|---|---|---|
| 2017 | $80.04 | $75.49 | 1.06x |
| 2020 | $123.22 | $86.88 | 1.42x |
| 2023 | $207.44 | $103.74 | 2.00x |
| 2025 | $278.80 | $113.14 | 2.46x |
| 2026 | $315.04 | $117.31 | 2.69x |
In 2026, the minimum wage is 2.69 times the UMA. Using the wrong one doesn't produce a rounding error. It produces a legally invalid calculation.
Here's where it gets treacherous. Different legal provisions reference different units:
If you calculate the prima de antiguedad cap using UMA instead of Salario Minimo, you get $234.62/day instead of $630.08/day -- shortchanging the worker by 62.8%. If you calculate an IMSS contribution cap using Salario Minimo instead of UMA, you overpay by 169%.
An ontology encodes which reference unit applies to which legal concept. It's not ambiguous. It's not up for interpretation. The `define_termino` cross-reference links each provision to the correct unit. A RAG system retrieves text about both UMA and Salario Minimo and leaves it to the LLM to figure out which one applies. That's where the hallucination lives.
This is the kind of mistake that peritos laborales -- the expert witnesses who prepare dictamenes periciales for courts -- flag as the #1 most common error since 2017. An ontology eliminates it structurally.
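Structurally eliminating the error is as simple as binding each concept to its unit. A sketch using the 2026 figures from the table above; the concept keys are illustrative:

```python
# Encoding the reference-unit binding described in the text: each legal
# concept points at the unit its article actually names, with its multiple.
UNITS_2026 = {"uma": 117.31, "salario_minimo": 315.04}
CONCEPT_UNIT = {
    "prima_antiguedad_cap": ("salario_minimo", 2),  # Arts. 162/486 LFT: 2x min wage
    "imss_sbc_cap":         ("uma", 25),            # Arts. 27-28 LSS: 25 UMAs
}

def cap(concept: str) -> float:
    """Daily cap for a concept, computed from the unit its law names."""
    unit, multiple = CONCEPT_UNIT[concept]
    return multiple * UNITS_2026[unit]

print(round(cap("prima_antiguedad_cap"), 2))  # 630.08
print(round(cap("imss_sbc_cap"), 2))          # 2932.75
```

Pick the wrong key and the number is wrong by 2.69x, which is exactly the failure mode the text describes.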
This is the question I kept circling back to. Sentencia works for Mexican labor law. Does it generalize?
I looked at six jurisdictions. The answer is yes -- with caveats. Every legal system shares five structural primitives: hierarchical legislation, semantic categorization, cross-reference graphs, temporal versioning, and computed domain values -- making the ontology pattern universally replicable. The pattern is universal. The enums are jurisdiction-specific.
The pattern maps most cleanly onto civil law systems -- France, Brazil, Saudi Arabia, UAE mainland -- because they share codified, hierarchical structures nearly identical to Mexican law. Common law systems (US, UK) need an adaptation layer that elevates case law to first-class status alongside statutes. Sharia-influenced systems (Saudi Arabia, UAE) need a third dimension: the religious/jurisprudential source hierarchy.
Having lived in both Paris and Dubai, I can tell you -- I've seen these legal systems from the inside, both as a founder and as someone who's had to deal with employment law across jurisdictions. The structural similarities are real.
Brazil's CLT (Consolidacao das Leis do Trabalho) would be the easiest port -- structurally almost identical to Mexico's LFT. France is medium complexity because of its dual legislative/regulatory track. The US is the hardest because of federal/state duality and the centrality of binding case law. Saudi Arabia and the UAE sit in the middle, with added complexity from bilingual requirements and, in the UAE's case, a triple-jurisdiction model (Federal + DIFC + ADGM).
But here's what matters: the adaptation is in the enums, not the architecture. The entity model -- laws, structural units, articles, cross-references, case law, computed values -- is universal. You rename entities, adjust taxonomies, and add jurisdiction-specific extensions. The cross-reference type system generalizes directly, with two additions (`interprets` for common law case-to-statute links, `overrides` for hierarchy conflicts). The temporal modeling generalizes completely.
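As a type sketch, the claim looks like this: one universal edge taxonomy, with common-law jurisdictions adding two members. All names are illustrative:

```python
from dataclasses import dataclass

# The universal entity shape: an edge between two provisions, typed by a
# member of the jurisdiction's taxonomy. IDs and names are illustrative.
@dataclass
class CrossReference:
    source: str
    target: str
    rel_type: str

UNIVERSAL_TYPES = {"refers_to", "complements", "modifies", "creates_exception",
                   "defines_term", "establishes_formula", "establishes_procedure"}
COMMON_LAW_EXTENSIONS = {"interprets", "overrides"}

def taxonomy(jurisdiction: str) -> set[str]:
    """Same architecture everywhere; common-law systems add two edge types."""
    base = set(UNIVERSAL_TYPES)
    if jurisdiction in {"US", "UK", "DIFC", "ADGM"}:
        base |= COMMON_LAW_EXTENSIONS
    return base

print(len(taxonomy("MX")))  # 7
print(len(taxonomy("US")))  # 9
print(CrossReference("LFT-162", "LFT-486", "refers_to").rel_type in taxonomy("MX"))  # True
```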
HTML_BLOCK_5
I've been building legal AI for the MENA region for the past two years. After pulling Sentencia apart, I can say with confidence: nobody in MENA is doing this. Not even close.
The competitive landscape in MENA legal AI is dominated by RAG-over-PDFs. Al Tamimi partnered with Harvey. Legora launched Arabic support in January 2026. There are 185+ legal tech companies in the region. But not one of them -- as far as I can find -- has built a Sentencia-style structured ontology with typed cross-reference graphs and computed value tables for MENA labor law.
No Arabic legal ontology exists. Not for labor law, not for commercial law, not for any domain. Zero. That's not a gap -- that's a vacuum.
That's our opening. And the market is real.
| Metric | Value |
|---|---|
| GCC Legal Technology Market | $1.2B |
| UAE Legal Tech (2023) | $114.5M |
| UAE Legal Tech (projected 2030) | $234.4M (CAGR 10.8%) |
| UAE Digital Justice Budget | AED 2.1B ($572M) |
| MEA Legal AI CAGR (2025-2030) | 18% |
The UAE government alone has allocated $572 million for digital transformation across the justice sector. Saudi Arabia's Vision 2030 includes specific legal technology modernization provisions. This isn't speculative -- the budgets are allocated, the mandates are issued.
The closest MENA-native competitor is Qanooni -- a Dubai-based AI legal drafting tool with a $2M pre-seed from Village Global. They're generic. No ontology. No knowledge graph. No structured legal reasoning. They're a wrapper around an LLM with a nice Outlook/Word integration.
That's it. $2M pre-seed, generic RAG. In a $1.2B market.
The UAE's Federal Decree-Law No. 33 of 2021 is roughly 65 articles plus implementing regulations. Saudi Arabia's labor law is about 245 articles. These are manageable corpora -- small enough to build a complete, human-validated ontology in weeks, not months.
UAE Decree-Law No. 33 is actually the ideal starting corpus for a legal ontology. It's modern (2021, amended 2024), well-structured, introduces six flexible employment models (remote, part-time, temporary) with different rules for each, and the temporal versioning challenge is already present thanks to the 2024 amendments. Perfect for proving the ontology pattern works.
The GCC also has high-value, high-frequency computed values that are perfectly suited to structured modeling: end-of-service gratuity (EOSG) formulas differ between UAE and Saudi Arabia. Emiratisation and Saudization quotas change by sector and company size. WPS (Wage Protection System) compliance thresholds matter for every employer. These aren't questions you want an LLM to hallucinate answers to. They need to come from structured, validated data.
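As an example of a computed value worth modeling, here's a sketch of the UAE end-of-service gratuity as commonly summarized from Art. 51 of Decree-Law 33/2021: 21 days' basic wage per year for the first five years, 30 days per year after, capped at two years' total wage. Treat this as an assumption to verify against the statute, not a definitive implementation:

```python
# UAE EOSG sketch per the common summary of Art. 51, Federal Decree-Law
# No. 33 of 2021. Verify against the statute and regulations before use.
def uae_eosg(basic_monthly: float, years: float) -> float:
    daily = basic_monthly / 30
    first_five = min(years, 5) * 21 * daily   # 21 days/year, years 1-5
    beyond = max(years - 5, 0) * 30 * daily   # 30 days/year thereafter
    return min(first_five + beyond, 24 * basic_monthly)  # 2-year cap

print(round(uae_eosg(10_000, 3), 2))  # 21000.0  (3 years x 21 days)
print(round(uae_eosg(10_000, 8), 2))  # 65000.0  (5x21 + 3x30 days)
```

Encode the Saudi variant as a sibling function with its own formula, and the agent never has to guess which jurisdiction's rule applies.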
Arabic-first is our advantage. Per ArabLegalEval benchmarks, Arabic legal NLP lacks the benchmarking frameworks available for English. RAG approaches that work reasonably well in English degrade significantly in Arabic due to rich morphology, orthographic ambiguity, and a shortage of annotated legal datasets. An ontology sidesteps this entirely -- it provides structured data rather than relying on NLP extraction from Arabic text. The LLM does synthesis, not extraction. That's a much simpler task and one where current models are already good.
And we already have the distribution channel. HAQQ runs on WhatsApp. Just like Sentencia delivers through Hiku on WhatsApp in Mexico, we deliver through WhatsApp in the GCC. The 7-tool MCP pattern means our WhatsApp interactions cost two cents each, not sixty. In a region where short, conversational message patterns dominate, that difference is -- no exaggeration -- the difference between a viable product and a money pit.
Phase 1 (Month 1-2): UAE Federal Labor Law. 65 articles + implementing regulations. EOSG calculator, leave tables, WPS thresholds. Arabic + English bilingual from day one. This is our primary market and the smallest corpus -- highest ROI.
Phase 2 (Month 2-3): Saudi Labor Law. 245 articles + 2024/2025 amendments. Different EOSG formula, Nitaqat quotas. Same language, similar structure -- we leverage everything from Phase 1.
Phase 3 (Month 3-4): DIFC + ADGM. Common law overlay for free zone case law. Completes the UAE picture for international firms operating across all three jurisdictions.
Phase 4 (Month 4-6): GCC expansion. Egypt, Bahrain, Kuwait, Qatar, Oman. Each new jurisdiction gets faster as the framework matures.
I'm still figuring out the exact timeline -- these always slip, any founder will tell you that. But the sequence is right.
One thing that gave me confidence during this research -- we're not inventing the idea that law should be machine-readable. Serious institutions have been working on this for years.
Singapore announced SOLID (Singapore Open Legal Informatics Database) in November 2025 -- a partnership between SMU's Centre for Digital Law and the Ministry of Law to build machine-readable datasets of court decisions, statutes, and legal scholarship, with a public API. Full launch expected Q1 2028.
The EU has been running the European Legislation Identifier (ELI) since 2012, now implemented by 21+ countries. Every EU legal text gets a unique URI, standardized metadata, and machine-readable format in RDFa or JSON-LD. Italy built an entire legislative knowledge graph on Akoma Ntoso, the UN's XML standard for legislative documents.
New Zealand's "Better Rules" initiative, running since 2018, goes furthest -- developing legislation simultaneously in plain language, rule statements, and code. Estonia's CIO called it "the most transformative idea" from international digital government summits.
The pattern is clear: the governments that are investing in computational law now will have the infrastructure for legal AI later. The ones that aren't... will be buying expensive RAG wrappers from Silicon Valley.
MENA is at a crossroads. The UAE's $572M digital justice budget suggests they're ready. The question is whether the legal AI they adopt will be architecturally sound or just another hallucination machine with a nice interface.
For anyone who wants to replicate this pattern -- whether for labor law, tax law, regulatory compliance, or any domain where structured knowledge matters -- here's what I've distilled into a practical process.
For a focused legal domain like a single country's labor law (65-245 articles), a complete ontology can be built in 4-8 weeks at a parsing cost of $5-20 per law.
Step 1: Identify Source Laws and Official Repositories. Find the 3-5 primary statutes, locate the official digital repository (government gazette, law portal), and assess machine-readability. HTML is best, clean PDF needs parsing, scanned PDF needs OCR. For MENA, most government publications are clean PDF -- LlamaParse handles Arabic well at 93-95% accuracy on legal documents ($0.003 per page).
Step 2: Design the Entity Model. Start from the universal template (Law, StructuralUnit, Article, CaseLaw, CrossReference, ComputedValue) and add jurisdiction-specific entities. For GCC: EndOfServiceGratuity, NationalizationQuota, WPSCompliance, ShariaReference.
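As a sketch of what the Step 2 template can look like in code, here are hypothetical TypeScript shapes for the core entities. The field names and the sample IDs are illustrative, not the actual schema of Sentencia or Dynamic Interfaces' platform.

```typescript
// Hypothetical shapes for the universal entity template in Step 2.
// Field names and IDs are illustrative, not a real platform schema.
interface Law {
  id: string;
  jurisdiction: string;   // e.g. "AE"
  title: string;
  effectiveFrom: string;  // ISO date
}

interface Article {
  id: string;
  lawId: string;
  number: string;         // "Article 51"
  themes: string[];       // semantic theme tags, assigned in Step 3
  validFrom: string;      // temporal versioning: one record per version
  validTo?: string;       // open-ended for the in-force version
  text: string;
}

interface CrossReference {
  fromArticleId: string;
  toArticleId: string;
  type: string;           // typed edge: "cites", "amends", ...
}

const article: Article = {
  id: "ae-33-2021-art-51",
  lawId: "ae-33-2021",
  number: "Article 51",
  themes: ["termination"],
  validFrom: "2022-02-02",
  text: "...",
};
console.log(article.themes.length);
```

Versioning as `validFrom`/`validTo` ranges is one common design choice; it lets "get article as of date X" become a simple range query instead of a retrieval guess.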
Step 3: Define Enum Taxonomies. This is the highest-value design step -- don't rush it. Define 15-25 semantic theme tags (wages, termination, contracts, leave, safety, discrimination, plus jurisdiction-specific themes like emiratisation or sharia_compliance). Define cross-reference types (start with Sentencia's 7, add interprets, overrides, implements as needed). Define computed value types. Every article should map to at least one theme.
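A taxonomy like this can be pinned down as literal union types, which makes the "every article maps to at least one theme" rule checkable. The specific tag and relation names below are examples of mine; I'm not reproducing Sentencia's actual seven cross-reference types.

```typescript
// Illustrative taxonomy definitions for Step 3. Tag and relation names
// are examples, not a canonical or complete list.
const THEMES = [
  "wages", "termination", "contracts", "leave", "safety",
  "discrimination", "emiratisation", "sharia_compliance",
] as const;
type Theme = (typeof THEMES)[number];

type CrossRefType =
  | "cites" | "amends" | "repeals" | "defines" | "excepts"  // core set (illustrative)
  | "interprets" | "overrides" | "implements";              // added as needed

// The invariant from the text: every article maps to at least one known theme.
function validateThemes(articleThemes: Theme[]): boolean {
  return (
    articleThemes.length >= 1 &&
    articleThemes.every((t) => (THEMES as readonly string[]).includes(t))
  );
}

console.log(validateThemes(["wages", "leave"])); // true
console.log(validateThemes([]));                 // false
```

Encoding the taxonomy as types rather than free-form strings means a mis-tagged article fails at ingest time, not at answer time.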
Step 4: Parse and Ingest Laws. Run source documents through a parser (LlamaParse for PDF, DOM parsing for HTML), extract the hierarchy (titles, chapters, articles), run semantic tagging via LLM, extract cross-references via regex + LLM, and load into Supabase. Expect $5-20 per law for initial parsing -- a one-time cost that pays back immediately.
Step 5: Build the Cross-Reference Graph. Extract explicit references ("Article X", "pursuant to Section Y"), detect implicit thematic connections via LLM, link inter-law references, and connect case law to statutes. Type every edge. Enforce acyclicity on amends and overrides relationships.
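The explicit-reference pass and the acyclicity constraint from this step can be sketched in a few lines. The regex and helper names below are minimal illustrations, not the production extractor.

```typescript
// Minimal sketch of Step 5's explicit-reference pass and acyclicity check.
// Regex and helper names are illustrative only.
function extractArticleRefs(text: string): string[] {
  const refs = new Set<string>();
  for (const m of text.matchAll(/Article\s+(\d+)/g)) refs.add(m[1]);
  return [...refs];
}

// amends/overrides edges must form a DAG; detect cycles with DFS coloring.
function hasCycle(edges: [string, string][]): boolean {
  const adj = new Map<string, string[]>();
  for (const [from, to] of edges) {
    if (!adj.has(from)) adj.set(from, []);
    adj.get(from)!.push(to);
  }
  const state = new Map<string, number>(); // unset = unseen, 1 = in stack, 2 = done
  const visit = (node: string): boolean => {
    if (state.get(node) === 1) return true;  // back-edge: cycle found
    if (state.get(node) === 2) return false; // already cleared
    state.set(node, 1);
    for (const next of adj.get(node) ?? []) if (visit(next)) return true;
    state.set(node, 2);
    return false;
  };
  return [...adj.keys()].some(visit);
}

console.log(extractArticleRefs("pursuant to Article 84 and Article 50"));
```

In practice the regex only catches the easy cases; the LLM pass handles paraphrased references ("the preceding article", "the law referred to above"), and the cycle check runs as a validation gate before edges are committed.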
Step 6: Generate SDK and MCP Tools. Expose 7 tools: search articles, get article, get cross-references, get case law, compute value, get law timeline, compare jurisdictions. Use the TypeScript MCP SDK. Each tool queries Supabase and returns structured JSON.
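Here is the seven-tool surface from Step 6 written out as plain descriptors. The parameter shapes are my guesses, and a real implementation would register these through the MCP SDK against Supabase rather than hold them in a bare array.

```typescript
// The seven tools from Step 6 as plain descriptors. Parameter shapes are
// assumptions for illustration, not the actual Sentencia definitions.
interface ToolDef {
  name: string;
  description: string;
  params: Record<string, string>; // param name -> type hint ("?" = optional)
}

const TOOLS: ToolDef[] = [
  { name: "search_articles", description: "Full-text and theme search", params: { query: "string", theme: "string?" } },
  { name: "get_article", description: "Fetch one article, optionally as of a date", params: { id: "string", asOf: "date?" } },
  { name: "get_cross_references", description: "Typed edges from an article", params: { id: "string", type: "string?" } },
  { name: "get_case_law", description: "Case law linked to a statute", params: { articleId: "string" } },
  { name: "compute_value", description: "Deterministic computed values (e.g. gratuity)", params: { kind: "string", inputs: "json" } },
  { name: "get_law_timeline", description: "Versions of a law over time", params: { lawId: "string" } },
  { name: "compare_jurisdictions", description: "Same theme across jurisdictions", params: { theme: "string", jurisdictions: "string[]" } },
];

console.log(TOOLS.length); // 7
```

Seven descriptions fit comfortably in a prompt; the ontology absorbs the relational complexity that would otherwise force hundreds of narrow tools.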
Step 7: Connect to Messaging. User sends a question via WhatsApp. LLM receives the message plus 7 tool definitions. LLM calls 1-3 tools to retrieve structured data. LLM synthesizes a response grounded in ontology data. Response sent with source citations. Total cost: $0.02.
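The Step 7 request flow looks roughly like this. The model's tool selection and final synthesis are stubbed with fixed values; nothing below is a real LLM or WhatsApp API.

```typescript
// Sketch of the Step 7 flow with the model call stubbed out. The tool
// choice and synthesis are hard-coded for illustration; no real LLM API.
type ToolCall = { tool: string; args: Record<string, unknown> };

function routeToolCall(call: ToolCall): unknown {
  // In production this dispatches to the 7 MCP tools backed by Supabase.
  if (call.tool === "get_article") return { id: call.args.id, text: "..." };
  return null;
}

function handleMessage(userText: string): string {
  // 1. Model reads the message plus 7 tool definitions and picks 1-3 calls
  //    (hard-coded here).
  const calls: ToolCall[] = [{ tool: "get_article", args: { id: "ae-33-2021-art-51" } }];
  // 2. Retrieve structured data from the ontology.
  const results = calls.map(routeToolCall).filter((r) => r !== null);
  // 3. Model synthesizes a grounded, cited answer (stubbed).
  return `Grounded answer for "${userText}" using ${results.length} ontology lookup(s).`;
}

console.log(handleMessage("How is the notice period calculated?"));
```

The structural point survives the stubbing: the model never answers from its weights alone; every response is assembled from structured lookups it can cite.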
That's it. No magic. Just modeling the domain correctly.
Here's what we're doing at HAQQ in the next 90 days.
Connecting to Dynamic Interfaces' MCP. We're exploring integration with their platform for ontology definition and SDK generation. Building the infrastructure layer from scratch is expensive and slow -- I know because I've been doing it. Using a platform that generates database tables, MCP actions, and typed SDKs from an ontology definition could cut months off our timeline.
Building a UAE Labor Law Ontology. Federal Decree-Law No. 33 of 2021 is our first target. We're mapping its 65 articles into the universal entity model, defining Arabic-English bilingual entities, building EOSG and leave calculators as computed value tables, and extracting cross-references between the law and its implementing regulations.
Testing with Real Legal Workflows. We're working with practitioners in the UAE to validate the ontology against actual legal questions -- end-of-service calculations, termination procedures, Emiratisation compliance. The goal isn't theoretical correctness. It's practical utility: does the ontology produce answers that a lawyer would sign off on? I don't know yet. That's the honest answer. But the architecture gives us the right foundation to find out.
Open-Sourcing the Universal Entity Model. The cross-jurisdiction entity mapping and the universal legal ontology pattern should not be proprietary. We plan to publish the base schema, the cross-reference type taxonomy, and the computed value type system as an open standard. The value is in the jurisdiction-specific data, not the framework.
I went into this expecting a cool demo and some ideas for the backlog. What I got was the architectural pattern I think will define the next wave of legal AI -- not just for MENA, but globally.
The insight is almost embarrassingly simple: model law as what it actually is. A structured, versioned, cross-referenced knowledge system. Stop treating it like a pile of PDFs to search through.
Every major legal AI company -- Harvey at $11B, CoCounsel with a million users, Legora at $5.5B -- is built on RAG. Stanford's study of production legal RAG tools measured hallucination rates of 17-33%. Meanwhile, in medicine, ontology-grounded systems hit 98% accuracy. The architecture exists. The academic validation exists. The market exists.
We've been building legal AI wrong. Not morally wrong -- architecturally wrong. And now I can see the better path.
The ontology is the secret. We're building ours.
Since I started talking about this publicly, the same questions keep coming up. Here are the honest answers.
A legal ontology is a structured, machine-readable model of a legal domain. Think of it as a knowledge graph specifically designed for law -- it defines entities (laws, articles, courts, computed values like wage tables), maps their hierarchical relationships, tracks typed cross-references between provisions, and handles temporal versioning for reforms. The key difference from RAG-based systems: instead of treating law as flat text to search through, a legal ontology captures the actual structure that legislation already has. Hierarchy, versioning, exceptions, formula chains -- all of it, explicitly modeled.
RAG chunks legal documents and retrieves semantically similar passages. An ontology models law as a connected knowledge graph with typed relationships. In practice, this means RAG loses three things that matter enormously in legal reasoning: hierarchy (is this article still inside the chapter it references?), temporal versioning (is this the pre-reform or post-reform version?), and cross-reference chains (does Article 50 depend on Article 84 which depends on a UMA definition?). An ontology preserves all three. The result is deterministic legal reasoning instead of probabilistic retrieval. That's not a subtle distinction -- it's the difference between "probably right" and "cite-ably right."
Harvey is an $11B company charging ~$1,200/lawyer/month (list price) with a 20-seat minimum. They use fine-tuned LLMs with RAG over legal databases -- no formal ontology, no structured knowledge graph, no typed cross-references. They claim 91% accuracy on their own BigLaw Bench, which still means a 9% error rate. Sentencia's ontology approach costs $0.02/message with deterministic calculations that don't hallucinate numbers. The architectures are fundamentally different: Harvey makes the model smarter; the ontology approach makes the model's job simpler. Both have a place, but for structured legal calculations -- wrongful dismissal, end-of-service gratuity, compliance thresholds -- the ontology produces verifiably correct results where RAG produces probabilistically approximate ones.
It's not just a problem -- it's quantified. Stanford researchers (Magesh et al., published in the Journal of Empirical Legal Studies, 2025) ran the first preregistered empirical evaluation of production legal RAG tools. They found hallucination rates of 17-33% across Westlaw AI-Assisted Research, Lexis+ AI, and Ask Practical Law AI. These aren't prototype tools -- these are the products lawyers are paying thousands of dollars per month to use. The researchers concluded that hallucinations remain "substantial, wide-ranging, and potentially insidious" and that claims of "hallucination-free" are demonstrably overstated. Meanwhile, in clinical medicine, ontology-grounded GraphRAG achieved 98% accuracy versus ChatGPT-4's 37% -- published in the Journal of Biomedical Informatics. The evidence is clear: structured knowledge representation dramatically reduces hallucination.
In the Sentencia case study, per-message costs dropped from $0.60 to $0.02 -- that's 97%. The savings come from two places: collapsing 300 MCP tool descriptions into 7 (which alone saves ~87,900 tokens per request) and enabling the switch from frontier models to open-source ones. When the model only needs to pick from 7 well-defined tools instead of 300, you don't need Claude Opus. A smaller, cheaper model handles it fine.
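The token arithmetic checks out under one assumption: roughly 300 tokens per tool description, which is my estimate rather than a figure from the case study.

```typescript
// Back-of-envelope check on the ~87,900-token figure. The ~300 tokens per
// tool description is an assumption, not a number from the case study.
const tokensPerTool = 300;
const promptTokensBefore = 300 * tokensPerTool; // 300 tool descriptions per request
const promptTokensAfter = 7 * tokensPerTool;    // 7 ontology-aware tools
console.log(promptTokensBefore - promptTokensAfter); // 87900
```

Those tokens are paid on every single request, which is why collapsing the tool count compounds so dramatically at chat-message volumes.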
MCP is Anthropic's open standard for connecting AI models to external tools and data sources, now governed by the Linux Foundation with support from OpenAI, Google, Microsoft, and AWS. As of 2026, there are 10,000+ active public MCP servers and 97 million monthly SDK downloads. In the context of legal AI, MCP tools expose ontology operations -- search articles, compute settlements, traverse cross-references -- as callable functions. Any MCP-compatible client can invoke them. The reason it matters: it means you can build your ontology once and make it accessible from any AI interface. WhatsApp, Slack, a web app, whatever. The ontology is the backend; MCP is the API layer.
Yes, but with an adaptation layer. Civil law systems (Mexico, France, Brazil, UAE, Saudi Arabia) map directly because they share codified, hierarchical structures. Common law systems need to elevate case law to first-class status alongside statutes -- you add an interprets cross-reference type for case-to-statute links and an overrides type for hierarchy conflicts. It's more work, but the fundamental pattern holds. The five structural primitives (hierarchy, categorization, cross-references, versioning, computed values) exist in every legal system.
For a focused domain like a single country's labor law: 4-8 weeks for a small corpus (65-245 articles). That includes source parsing, entity modeling, cross-reference extraction, and MCP tool generation. Parsing costs run $5-20 per law using tools like LlamaParse. The universal entity model template we're building accelerates subsequent jurisdictions -- once you've done it for UAE labor law, Saudi labor law goes faster because the framework is already there.
It's an approach where you define the domain model (the ontology) first, and then all infrastructure -- database tables, API endpoints, AI tool definitions, typed SDKs -- gets generated from that definition. Dynamic Interfaces built their entire platform around this idea. The benefit: type-level consistency across the whole stack. When you change an entity in the ontology, the database schema, the API, and the MCP tools all update in sync. It dramatically reduces the number of tools an AI agent needs because the ontology encodes the relationships that would otherwise require dozens of separate database queries.
Stephane Boghossian is the CEO and co-founder of HAQQ Legal AI, building AI-powered legal tools for the MENA region. A serial founder with a focus on open-source models, Stephane splits his time between Paris and Dubai and writes about legal technology, AI architecture, and building in emerging markets. Connect with him on LinkedIn or reach out via HAQQ's WhatsApp.
Academic Papers:
Legal Sources (Mexico):
Industry: