Meridian Open Banking Lab Formal Models of the Agentic Knowledge Graph · v2.0
Tab 00 · The Substrate

The open-banking knowledge graph problem

Three banks. One ecosystem. Customers fragmented across systems. Third-party providers consuming via consented APIs. Every formal model below addresses one slice of this — the agentic graph composes them all.

Why this substrate matters

Open banking forces the bank's catalog to externalise. A regulated third-party provider (TPP) accesses customer data under explicit consent, for a stated purpose, and for a defined retention period. The agent doing the access must identify the customer correctly (Fellegi-Sunter), understand the domain semantically (RIGOR / OWL), map the source schemas (R2RML / RML), validate against contracts (ODCS / SHACL), trace provenance for audit (W3C PROV / Semirings), enforce policy (OPA / Zanzibar), and retrieve over the graph (GraphRAG / SPARQL). Each formal model does one job. The lab demonstrates each, then composes them.

Source Systems: 5 (3 internal · 2 external)
Total Source Rows: 12
Distinct Humans: 3
Identity Variations: 7
Active TPPs: 1 (BudgetEye · PSD2 AISP)
Consents: 2
What the lab demonstrates · 10 tabs · 45+ models
01 · Identity
Fellegi-Sunter · Splink · Ditto · Jaro-Winkler · Sorted Neighborhood
Resolve Sarah Chen across all 3 banks with probabilistic linkage + blocking + transformer matching.
02 · Ontology
RIGOR · OWL · RDFS · Description Logic · OntoGPT · Layer Cake
Generate OWL ontology from schemas iteratively. Reason: ISA ⊑ SavingsAccount, illiquid disjoint LiquidAsset.
03 · Mapping
R2RML · RML · OBDA · Ontop
Map relational tables to RDF. Rewrite SPARQL to SQL across 3 banks live.
04 · Schema Matching
Cupid · COMA · Valentine · TaBERT
Match balance ↔ accountBalance ↔ availableBalance across 3 different bank schemas.
05 · Contracts
ODCS · SHACL · ShEx · DCAT · ISO 11179
Validate Sarah's consent payload against SHACL shapes. Fail fast. Enforce semantics.
06 · Provenance
W3C PROV · OPM · Why/Where/How · Semirings · OpenLineage
Trace Marcus's transaction → BCBS report as a polynomial. Replay 7 years later.
07 · Rules & Policy
DMN · SBVR · Datalog · OPA · Cedar · RBAC · ABAC · ReBAC
"Can BudgetEye read Sarah's balance?" Eight policy models · one decision · one audit row.
08 · Query & GraphRAG
SPARQL · Cypher · GQL · GraphRAG · TransE · GraphSAGE
Same question, three query languages. Then watch GraphRAG ground an LLM answer.
09 · Unified KG
All models composed
Everything together. Click any node. Toggle layers. The graph carries the contract.
The thesis. Modern catalogs and open-banking platforms implement these models — often without naming them. By naming them explicitly and demonstrating each on the same substrate, this lab makes the academic spine of an agentic open-banking knowledge graph visible.
Tab 01 · Identity Resolution

Resolving Sarah Chen across three banks

Probabilistic record linkage (Fellegi-Sunter 1969) plus blocking, string similarity, and modern transformer matching — composed on real rows.

Fellegi-Sunter 1969
Jaro-Winkler 1989
Levenshtein 1965
Sorted Neighborhood 1995
Splink 2020
Ditto 2020
DeepMatcher 2018
Magellan / ZeroER
Fellegi-Sunter (1969) Theory for Record Linkage · JASA Foundational

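For each comparable field, estimate m = P(agree | match) and u = P(agree | non-match). A field that agrees contributes log2(m/u) to the match weight; a field that disagrees contributes log2((1−m)/(1−u)). Sum the contributions, then apply two thresholds to classify the pair as MATCH / POSSIBLE / NON-MATCH.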
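The weight calculation can be sketched in a few lines of Python. The m/u values, field names, and thresholds below are hypothetical placeholders chosen for illustration, not parameters taken from the lab's data.

```python
import math

# Hypothetical m/u probabilities per field (illustrative assumptions).
FIELD_PARAMS = {
    "surname":  {"m": 0.95, "u": 0.01},
    "dob":      {"m": 0.98, "u": 0.003},
    "postcode": {"m": 0.80, "u": 0.02},
}

def field_weight(m, u, agrees):
    """log2 likelihood ratio contributed by one field."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1.0 - m) / (1.0 - u))

def match_weight(agreement):
    """Sum the per-field weights for one candidate pair."""
    return sum(
        field_weight(p["m"], p["u"], agreement[name])
        for name, p in FIELD_PARAMS.items()
    )

def classify(weight, lower=0.0, upper=10.0):
    """Two-threshold decision; the thresholds here are assumed values."""
    if weight >= upper:
        return "MATCH"
    if weight <= lower:
        return "NON-MATCH"
    return "POSSIBLE"

all_agree = match_weight({"surname": True, "dob": True, "postcode": True})
dob_differs = match_weight({"surname": True, "dob": False, "postcode": True})
none_agree = match_weight({"surname": False, "dob": False, "postcode": False})
print(classify(all_agree), classify(dob_differs), classify(none_agree))
```

In practice the m and u values are estimated from data (for example with EM) rather than set by hand.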

CANDIDATES UNDER COMPARISON:
CORE.C001 CARDS.CD-0451 WEALTH.WM-7821
String similarity Jaro-Winkler 1989 · Levenshtein 1965

The actual functions that feed FS comparison vectors. Compare names across banks:

A | B | JW | Lev
Sarah Chen | Chen Sarah | 0.94 | 10
Sarah Chen | S. Chen | 0.82 | 5
Sarah Chen | Marcus Aldridge | 0.41 | 14
Sorted Neighborhood Hernandez-Stolfo 1995

Reduces O(n²) comparisons by sorting on a key (Soundex / dob-year) then comparing only within a sliding window. Makes cross-bank ER tractable.

// Across 3 banks: 7 rows → C(7,2) = 21 comparisons
// With blocking on dob-year:
window(1989): [CORE.C001, CARDS.CD-0451, WEALTH.WM-7821]
window(1972): [CORE.C002, CARDS.CD-0892, WEALTH.WM-7944]
window(1995): [CORE.C003]
// 3+3+0 = 6 pair-comparisons instead of 21 (71% reduction)
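The arithmetic above can be checked with a short Python sketch. Record ids and dob-years come from the example; the window size of 3 in the sliding-window variant is an assumption. Exact-key blocking reproduces the 6 pairs shown. A true sorted-neighbourhood pass slides a fixed window over the sorted list and so also compares a few records across adjacent keys.

```python
from collections import defaultdict
from itertools import combinations

# (record id, dob-year) pairs taken from the example above.
ROWS = [
    ("CORE.C001", 1989), ("CARDS.CD-0451", 1989), ("WEALTH.WM-7821", 1989),
    ("CORE.C002", 1972), ("CARDS.CD-0892", 1972), ("WEALTH.WM-7944", 1972),
    ("CORE.C003", 1995),
]

def all_pairs(rows):
    """Every unordered pair: the O(n^2) baseline."""
    return list(combinations([rid for rid, _ in rows], 2))

def blocked_pairs(rows):
    """Compare only records that share the same blocking key."""
    blocks = defaultdict(list)
    for rid, year in rows:
        blocks[year].append(rid)
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(ids, 2))
    return pairs

def sorted_neighbourhood_pairs(rows, window=3):
    """Sort on the key, then pair each record with the next window-1 records."""
    ordered = [rid for rid, _ in sorted(rows, key=lambda row: row[1])]
    pairs = []
    for i, rid in enumerate(ordered):
        for other in ordered[i + 1 : i + window]:
            pairs.append((rid, other))
    return pairs

total = len(all_pairs(ROWS))
kept = len(blocked_pairs(ROWS))
reduction = 1 - kept / total
print(total, kept, round(reduction * 100))
```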
Modern stack · scale + neural Splink 2020 · Ditto 2020 · DeepMatcher 2018 · Magellan / ZeroER Production

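Splink applies FS at scale with expectation-maximisation (no labels needed). Ditto fine-tunes pre-trained transformers such as BERT for matching; DeepMatcher (2018) uses earlier RNN and attention architectures over word embeddings. ZeroER matches without labelled examples, while Magellan is a toolkit for building supervised matching pipelines. All output a confidence score; the catalog reconciles them.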

Tab 02 · Ontology & Semantics

From schemas to OWL — and reasoning over it

RIGOR generates the ontology iteratively. OWL 2 DL gives the formal logic. A Description Logic reasoner draws inferences. OntoGPT extracts from text. The layer cake puts it all in order.

RIGOR 2025
OWL / OWL 2 (W3C)
RDFS (W3C 2004)
Description Logic
OntoGPT 2023
Layer Cake (Cimiano-Mädche)
FIBO / BIAN / ISO 20022
RIGOR (Nayyeri et al., 2025) Retrieval-Augmented Iterative Generation of RDB Ontologies Frontier

For each table: retrieve schema + domain ontology (FIBO/BIAN) + growing core ontology → Gen-LLM produces delta-ontology fragment → Judge-LLM validates → merge. Iterate following foreign-key constraints.

Description Logic Reasoner SROIQ · the formal logic behind OWL 2 DL Reasoning

Once the ontology exists, a DL reasoner draws inferences. Premises: ISA ⊑ SavingsAccount; SavingsAccount ⊑ Account ⊓ ∃hasWithdrawalPenalty.Penalty; ∃hasWithdrawalPenalty.Penalty ⊓ LiquidAsset ⊑ ⊥ (anything that carries a withdrawal penalty is disjoint from LiquidAsset). Inference: ISA ⊑ ¬LiquidAsset, so ISA accounts are not liquid assets.
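A real reasoner handles full SROIQ; the following Python sketch only illustrates the shape of this one inference. It assumes the existential restriction is flattened into a named class, here called PenalisedAccount (a hypothetical name), and that this class is declared disjoint from LiquidAsset.

```python
# Told subclass axioms. "PenalisedAccount" is a hypothetical named stand-in
# for the class expression "has some withdrawal penalty".
SUBCLASS_OF = {
    "ISA": {"SavingsAccount"},
    "SavingsAccount": {"Account", "PenalisedAccount"},
}

# Told disjointness axioms, stored as unordered pairs.
DISJOINT = {frozenset({"PenalisedAccount", "LiquidAsset"})}

def superclasses(cls):
    """All classes reachable from cls by following subclass axioms."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def is_disjoint(a, b):
    """True if some superclass of a is declared disjoint from some superclass of b."""
    ups_a = superclasses(a) | {a}
    ups_b = superclasses(b) | {b}
    return any(frozenset({x, y}) in DISJOINT for x in ups_a for y in ups_b)

print(sorted(superclasses("ISA")))
print(is_disjoint("ISA", "LiquidAsset"))
```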

OntoGPT 2023

Extract structured ontology-aligned knowledge from unstructured text. Example: a regulatory policy doc → ontology candidates.

Input doc:
  "All ISAs must have a 90-day withdrawal restriction or pay a penalty of 5% of withdrawn amount."
OntoGPT extraction:
  Concept:  ISA
  Property: hasWithdrawalRestriction   range: Duration ≥ 90 days
  Property: hasEarlyWithdrawalPenalty  range: Percentage = 5%
Ontology Learning Layer Cake Cimiano & Mädche 2005

The canonical 7-layer pipeline that every ontology-learning method instantiates:

▲ Axioms (rules)
│ Relations
│ Concept hierarchies
│ Concepts
│ Synonyms
│ Terms
└─ Raw text / data
RDFS · the lightweight base W3C 2004 · classes, subclasses, domain, range

Before full OWL reasoning, RDFS provides the semantic backbone — classes, sub-class relations, property domains and ranges. Every richer ontology compiles down to RDFS triples for portability.

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ob: <http://meridian.bank/ontology/> .

ob:Customer     rdfs:subClassOf ob:LegalPerson .
ob:WealthClient rdfs:subClassOf ob:Customer .
ob:hasAccount   rdfs:domain ob:Customer ;
                rdfs:range  ob:Account .
Tab 03 · Schema → Graph Mapping

From relational tables to a virtual knowledge graph

R2RML and RML declare the mapping from tabular data to RDF. OBDA / Ontop rewrite SPARQL queries into SQL against the live sources — no data is moved. Watch a SPARQL query traverse three banks live.

R2RML (W3C 2012)
RML (Ghent · 2014 · 2024 spec)
OBDA · Calvanese 2007+
Ontop
RDF (W3C 1999)
R2RML mapping fragment W3C 2012 · the standard relational→RDF mapping language Standard

Declarative TriplesMap: a logical source (SQL view), a subject template, predicate-object maps. RIGOR generates these automatically.

@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix fibo: <https://spec.edmcouncil.org/fibo/> .
@prefix ob: <http://meridian.bank/ontology/> .

<#CustomerMap>
  rr:logicalTable [ rr:tableName "MERIDIAN.core_banking" ] ;
  rr:subjectMap [
    rr:template "http://meridian.bank/customer/{customer_id}" ;
    rr:class fibo:Customer ;
  ] ;
  rr:predicateObjectMap [
    rr:predicate ob:hasFullName ;
    rr:objectMap [ rr:column "name" ]
  ] ;
  rr:predicateObjectMap [
    rr:predicate ob:hasKycStatus ;
    rr:objectMap [ rr:column "kyc" ]
  ] .
RML · the JSON/CSV/API extension Ghent IDLab · 2014 · v2 spec 2024

Open banking sources are API/JSON-heavy. RML adds a rml:source with format/iterator/reference so the same mapping grammar applies to PSD2 JSON responses.

@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ob: <http://meridian.bank/ontology/> .

<#ConsentMap>
  rml:logicalSource [
    rml:source "/ob/v3.1/aisp/consents.json" ;
    rml:referenceFormulation ql:JSONPath ;
    rml:iterator "$.Data.Consent[*]"
  ] ;
  rr:subjectMap [
    rr:template "http://meridian.bank/consent/{ConsentId}" ;
    rr:class ob:Consent
  ] .
OBDA · Virtual Knowledge Graph Calvanese et al. · Ontop · the bank's data stays put Live

The agent writes SPARQL. Ontop rewrites it to SQL across all three banks. Data is never physically moved. Click below to watch a real query flow.

Agent's SPARQL query
PREFIX ob: <http://meridian.bank/ontology/>
PREFIX fibo: <https://spec.edmcouncil.org/fibo/>

SELECT ?customer ?name ?aum ?cardTier
WHERE {
  ?customer a ob:WealthClient .
  ?customer ob:hasFullName ?name .
  ?customer ob:hasAum ?aum .
  ?customer ob:holdsCard ?card .
  ?card ob:hasCardTier ?cardTier .
  FILTER(?aum > 1000000)
}
Tab 04 · Schema Matching

When three banks call the same thing three different names

balance · accountBalance · availableBalance — same concept, three labels. Cupid uses linguistic + structural similarity. COMA composes multiple matchers. Valentine benchmarks them. TaBERT does it with transformers.

Cupid · Madhavan-Bernstein-Rahm 2001
COMA / COMA++ 2002-2005
Valentine · TU Delft 2021
TaBERT / TURL / Tabbie 2020-21
LogMap (Oxford)
Three bank schemas · one shared concept Interactive

Each bank labels the customer's spendable funds differently. The matcher must align them to the canonical concept ob:availableBalance.

MERIDIAN.core_banking
field: balance
decimal(13,2) · GBP · updated daily 02:00 UTC
MERIDIAN.cards
field: avail_credit
decimal · credit limit minus outstanding
MERIDIAN.wealth
field: cashBalanceGBP
numeric · settled cash · excludes pending
Why composite matching · COMA Aumueller-Do-Massmann-Rahm 2002-2005

No single matcher works in isolation. COMA runs linguistic (Levenshtein on labels), structural (graph topology of foreign keys), type (data-type compatibility), and instance (sample value overlap) matchers in parallel and combines with learned weights. The composite score is what the catalog records.
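A composite score of this kind can be sketched as a weighted average of per-matcher scores. The Python below is a minimal sketch, not COMA itself: the weights are fixed by hand rather than learned, and the structural, type, and instance scores are supplied as hypothetical inputs. Only the linguistic matcher (normalised Levenshtein on labels) is actually computed.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                        # deletion
                current[j - 1] + 1,                     # insertion
                previous[j - 1] + (char_a != char_b),   # substitution
            ))
        previous = current
    return previous[-1]

def label_similarity(a, b):
    """Case-insensitive Levenshtein distance normalised to [0, 1]."""
    longest = max(len(a), len(b)) or 1
    return 1.0 - levenshtein(a.lower(), b.lower()) / longest

# Hypothetical hand-set weights; COMA-style systems would tune or learn these.
WEIGHTS = {"linguistic": 0.4, "structural": 0.2, "type": 0.2, "instance": 0.2}

def composite(scores, weights=WEIGHTS):
    """Weighted average of the individual matcher scores."""
    total = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total

scores = {
    "linguistic": label_similarity("balance", "cashBalanceGBP"),
    "structural": 0.8,   # assumed value
    "type": 1.0,         # assumed value
    "instance": 0.7,     # assumed value
}
print(round(composite(scores), 2))
```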

Valentine benchmark · is the match good? TU Delft 2021 · the matcher's report card

Valentine provides labeled benchmarks (TPC-DI, Magellan, OpenData) so you can measure precision/recall/F1 of any matcher. Without it, agentic schema matching is unverifiable. With it, the catalog can record: "Cupid on this corpus: F1 = 0.82. TaBERT: F1 = 0.91. We trust TaBERT above threshold 0.85 with steward review below."
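The metrics themselves are simple to compute once a labelled ground truth exists. The Python sketch below scores a set of predicted column correspondences against a gold set. Both sets are hypothetical and reuse the field names from the three schemas above, and the function is a generic precision/recall/F1 calculation rather than Valentine's own API.

```python
def precision_recall_f1(predicted, truth):
    """Score predicted correspondences against a labelled gold set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical gold standard: all three fields denote the same concept.
GOLD = {
    ("balance", "avail_credit"),
    ("balance", "cashBalanceGBP"),
    ("avail_credit", "cashBalanceGBP"),
}

# Hypothetical matcher output: two correct pairs and one false positive.
PREDICTED = {
    ("balance", "avail_credit"),
    ("balance", "cashBalanceGBP"),
    ("balance", "kyc"),
}

precision, recall, f1 = precision_recall_f1(PREDICTED, GOLD)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```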

Tab 05 · Data Contracts & Semantic Validation

Sarah's consent enters BudgetEye — is it valid?

ODCS captures the contract metadata (schema, SLA, ownership). SHACL validates the actual RDF graph against required shapes. ShEx is the compact alternative. DCAT publishes data products. ISO 11179 registers data elements. Together: enforceable semantic contracts.

ODCS · Open Data Contract Standard
SHACL (W3C 2017)
ShEx 2014+
DCAT v3 (W3C 2024)
Dublin Core
ISO/IEC 11179
ODCS contract for Sarah's consent Open Data Contract Standard · YAML Contract

The TPP cannot consume the data product without an explicit contract. The contract names the schema, SLA, ownership, quality expectations, and access conditions.

version: 1.0.0
kind: DataContract
apiVersion: v3.0.1
id: ob.consent.aisp.v1
info:
  title: Open Banking AISP Consent
  owner: meridian.bank/data-office
  jurisdiction: UK
  regulation: [PSD2, UK_OB_v3.1, FAPI2]
schema:
  - name: consent
    physicalName: OB_API.consents
    properties:
      - name: consent_id
        primary: true
        required: true
      - name: customer_id
        required: true
        fk: Customer
      - name: tpp_id
        required: true
        fk: TPP
      - name: scope
        required: true
        cardinality: 1..n
      - name: expires_at
        required: true
        type: timestamp
      - name: status
        required: true
        enum: [ACTIVE, REVOKED, EXPIRED]
quality:
  - rule: "status==ACTIVE implies expires_at > now()"
  - rule: "every consent_id is unique"
  - rule: "scope MUST be in [accounts:read, balances:read, transactions:read]"
sla:
  availability: 99.95%
  latency_p99_ms: 250
  retention_days: 2557 # 7 years
access:
  classification: CONFIDENTIAL_PII
  policy: ob.policy.consent.tpp_access.v3
SHACL · validating the graph W3C 2017 · shapes constraint language Validator

The contract is the spec. SHACL is the runtime enforcement. Every consent flowing into the KG passes the validator first. Reject early.

SHACL Shape · ob:ConsentShape
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ob: <http://meridian.bank/ontology/> .

ob:ConsentShape a sh:NodeShape ;
  sh:targetClass ob:Consent ;
  sh:property [
    sh:path ob:authorisedBy ;
    sh:class ob:Customer ;
    sh:minCount 1 ;
    sh:maxCount 1
  ] ;
  sh:property [
    sh:path ob:hasScope ;
    sh:in ("accounts:read" "balances:read" "transactions:read") ;
    sh:minCount 1
  ] ;
  sh:property [
    sh:path ob:hasExpiry ;
    sh:datatype xsd:dateTime ;
    sh:minCount 1
  ] .
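To make the shape's meaning concrete, here is a plain-Python sketch that mirrors its cardinality, value-list, and datatype constraints on dictionary payloads. It is not a SHACL engine: the payload layout is an assumption, and the sh:class check on ob:authorisedBy is omitted because the dictionaries carry no type information.

```python
from datetime import datetime

ALLOWED_SCOPES = {"accounts:read", "balances:read", "transactions:read"}

def validate_consent(consent):
    """Return a list of violations; an empty list means the payload conforms."""
    violations = []

    # ob:authorisedBy: minCount 1, maxCount 1
    if len(consent.get("authorisedBy", [])) != 1:
        violations.append("authorisedBy: expected exactly 1 value")

    # ob:hasScope: minCount 1, every value drawn from the sh:in list
    scopes = consent.get("hasScope", [])
    if not scopes:
        violations.append("hasScope: expected at least 1 value")
    for scope in scopes:
        if scope not in ALLOWED_SCOPES:
            violations.append(f"hasScope: {scope!r} is not an allowed value")

    # ob:hasExpiry: minCount 1, datatype dateTime
    expiries = consent.get("hasExpiry", [])
    if not expiries:
        violations.append("hasExpiry: expected at least 1 value")
    for expiry in expiries:
        if not isinstance(expiry, datetime):
            violations.append(f"hasExpiry: {expiry!r} is not a dateTime")

    return violations

# Hypothetical test payloads.
good = {
    "authorisedBy": ["customer/C001"],
    "hasScope": ["balances:read"],
    "hasExpiry": [datetime(2026, 12, 31)],
}
bad = {"authorisedBy": [], "hasScope": ["payments:write"], "hasExpiry": ["tomorrow"]}

print(validate_consent(good))
print(validate_consent(bad))
```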
ShEx · the compact alternative 2014+

Shape Expressions — same job as SHACL, more compact syntax, easier to author by hand. Some practitioners pair the two: ShEx for definition, SHACL for execution.

PREFIX ob: <http://meridian.bank/ontology/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

ob:ConsentShape {
  ob:authorisedBy @<CustomerShape> ;
  ob:grantedTo    @<TPPShape> ;
  ob:hasScope     [ "accounts:read" "balances:read" "transactions:read" ]+ ;
  ob:hasExpiry    xsd:dateTime ;
  ob:hasStatus    [ "ACTIVE" "REVOKED" "EXPIRED" ]
}
DCAT · publishing the data product W3C · v3 2024

Once the contract is valid, the data product is published via DCAT. Dataset, Distribution, DataService nodes. Combined with Dublin Core for metadata (title, creator, license).

@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .

:obConsent_v3 a dcat:Dataset ;
  dct:title "Open Banking AISP Consents" ;
  dct:publisher :meridian ;
  dct:license :fapi2_license ;
  dcat:distribution :obConsent_v3_json ;
  dcat:contactPoint :dataoffice ;
  dct:conformsTo :ob.consent.aisp.v1 .
ISO/IEC 11179 · the registry of data elements 1990s+ · the discipline behind any glossary

Every data element in the contract — consent_id, customer_id, scope — is registered with a definition, name, identifier, value domain, and steward. ISO 11179 is what makes a bank's glossary defensible to a regulator.

Tab 06 · Provenance & Lineage

Marcus's transaction → BCBS report — five formal models

The Open Provenance Model gave us the basic vocabulary. W3C PROV made it a standard. Provenance Semirings gave us the algebra of trust. Why/Where/How gave us three formal questions to ask. OpenLineage made it operational.

W3C PROV-DM (2013)
OPM · Moreau 2008
Why-Prov · Cui-Widom 2000
Where-Prov · Buneman 2001
How-Prov · Green 2007
Provenance Semirings · PODS 2007
OpenLineage 2020
Provenance Semirings (PODS 2007) Green · Karvounarakis · Tannen · University of Pennsylvania Algebraic

Annotate each source tuple with a variable. Join (⊗) multiplies. Union (⊕) adds. The result is a polynomial — evaluate it in different semirings to get why-provenance, trust, confidence, multiplicity.
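The idea of one polynomial, many interpretations can be sketched in Python. The polynomial below, (t ⊗ p) ⊕ (t ⊗ r), and the variable names t, p, r are hypothetical annotations for three source tuples. The three semirings shown are the Boolean semiring (does a derivation exist?), the counting semiring (how many derivations?), and the max-times semiring (confidence of the best derivation).

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]    # interprets union (⊕)
    times: Callable[[Any, Any], Any]   # interprets join (⊗)

BOOLEAN = Semiring(plus=lambda a, b: a or b, times=lambda a, b: a and b)
COUNTING = Semiring(plus=lambda a, b: a + b, times=lambda a, b: a * b)
TRUST = Semiring(plus=max, times=lambda a, b: a * b)  # max-times

def evaluate(expr, semiring, valuation):
    """Evaluate a provenance expression tree under a semiring and a valuation."""
    kind = expr[0]
    if kind == "var":
        return valuation[expr[1]]
    left = evaluate(expr[1], semiring, valuation)
    right = evaluate(expr[2], semiring, valuation)
    if kind == "times":
        return semiring.times(left, right)
    if kind == "plus":
        return semiring.plus(left, right)
    raise ValueError(f"unknown node kind: {kind}")

# (t ⊗ p) ⊕ (t ⊗ r): the result is derivable via t joined with p, or t joined with r.
POLY = (
    "plus",
    ("times", ("var", "t"), ("var", "p")),
    ("times", ("var", "t"), ("var", "r")),
)

exists = evaluate(POLY, BOOLEAN, {"t": True, "p": False, "r": True})
copies = evaluate(POLY, COUNTING, {"t": 1, "p": 2, "r": 3})
trust = evaluate(POLY, TRUST, {"t": 0.9, "p": 0.5, "r": 0.8})
print(exists, copies, round(trust, 2))
```

The expression is computed once at query time; switching the audit question only means switching the semiring and the valuation.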

Why · Where · How — three formal questions Cui-Widom 2000 · Buneman-Khanna-Tan 2001 · Green-Tannen 2007

Every derived fact in the catalog can be queried three ways. Why: which source tuples contributed? Where: which source location did each value come from? How: which derivation steps were applied?

WHY-PROV (witness sets)
> Which tuples produced the BCBS line?
  { txn(T-99041), pos(P-7741), risk(R-CBS-EU), cust(C002) }
WHERE-PROV (locations)
> Where did "EUR Bund" come from?
  TXN_DB.t-99041.instrument @ ingested 2026-03-12T08:14Z
HOW-PROV (derivation)
> How was RWA 625K computed?
  compose(
    join(txn, position, on=client_id, instrument),
    apply(risk_model.lookup(instrument_type)),
    aggregate(group=counterparty_type, op=sum)
  )
W3C PROV-DM · the standard representation 2013 · Entity · Activity · Agent Standard

Provenance semirings provide the algebra. W3C PROV provides the interchange format. Lineage tools such as OpenLineage, Marquez, Atlas, and Egeria emit events whose jobs, runs, and datasets can be mapped onto PROV's Entity-Activity-Agent model.

OPM & OpenLineage · operational Moreau 2008 → W3C 2013 → LFAI 2020

OPM (2008) introduced Artifact-Process-Agent as the three node types. W3C PROV refined this to Entity-Activity-Agent. OpenLineage (2020) made it operational — event-driven emission from every pipeline run. The catalog projects the events into a graph.

// OpenLineage event emitted when risk_aggregate job runs
{
  "eventType": "COMPLETE",
  "eventTime": "2026-04-30T03:00:14Z",
  "run": { "runId": "a8f3..." },
  "job": { "namespace": "meridian.risk", "name": "risk_aggregate" },
  "inputs": [
    { "namespace": "meridian.txn", "name": "transactions" },
    { "namespace": "meridian.pos", "name": "positions" },
    { "namespace": "meridian.risk", "name": "rwa_model" }
  ],
  "outputs": [
    { "namespace": "meridian.report", "name": "bcbs_239_exposure" }
  ]
}
Tab 07 · Rules & Policy

Can BudgetEye read Sarah's balance?

One question. Eight formal models converge on one decision. Business rules from DMN/SBVR/Datalog. Authorisation from RBAC/ABAC/ReBAC. Policy-as-code in OPA/Rego or Cedar. Watch them compose.

DMN 2015
SBVR 2008
Datalog 1977+
RuleML
RBAC 1996
ABAC · NIST 800-162
ReBAC · Zanzibar 2019
OPA / Rego 2018
AWS Cedar 2023
The decision · live Interactive

Configure the request and watch every model produce its verdict. The catalog records the union of all eight decisions plus the final composed allow/deny.

DMN · Decision Model and Notation OMG 2015

Business decisions as tables. Auditable. Owned by business, not engineering. Compiles to executable.

Decision Table · TPP_AccessEligibility
┌───────────────┬───────────┬──────────┐
│ TPP.regulated │ TPP.eIDAS │ Result   │
├───────────────┼───────────┼──────────┤
│ true          │ valid     │ ELIGIBLE │
│ true          │ expired   │ REFUSE   │
│ false         │ –         │ REFUSE   │
└───────────────┴───────────┴──────────┘
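The table above can be executed directly. This Python sketch assumes a first-hit policy (the first matching row wins) and treats the "–" cell as "any value"; the input key names are illustrative.

```python
# Each rule is (conditions, result). A key missing from the conditions
# means "any value", which is how the "–" cell is modelled here.
RULES = [
    ({"regulated": True, "eidas": "valid"}, "ELIGIBLE"),
    ({"regulated": True, "eidas": "expired"}, "REFUSE"),
    ({"regulated": False}, "REFUSE"),
]

def decide(tpp):
    """Return the result of the first rule whose conditions all hold."""
    for conditions, result in RULES:
        if all(tpp.get(key) == value for key, value in conditions.items()):
            return result
    return None  # no rule matched; a caller should treat this as a refusal

print(decide({"regulated": True, "eidas": "valid"}))
print(decide({"regulated": False, "eidas": "valid"}))
```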
SBVR · Semantics of Business Vocabulary and Rules OMG 2008

Business rules in structured natural language. Bridges business and machine.

// Rule R-OB-001 · expressed in SBVR-SE
It is obligatory that each consent has an expiry date after its creation date.
// Rule R-OB-002
It is prohibited that a TPP accesses an account without an active consent.
The authorisation triad · RBAC / ABAC / ReBAC 1996 · 2014 · 2019

Three philosophies of access decisions, each strongest in a different layer of the open-banking graph.

RBAC · roles
Sandhu et al. 1996
"Anyone with role TPP-AISP can call /balances endpoints." Coarse but auditable.
ABAC · attributes
NIST 800-162 (2014)
"Subject.regulator=FCA AND Resource.classification=PII AND Env.jurisdiction=UK." Composable but verbose.
ReBAC · relationships
Zanzibar 2019
"Sarah granted BudgetEye consent → traverse graph edge." Native to KGs. The agentic answer.
OPA / Rego · policy-as-code Styra 2018 · Datalog-derived
package ob.tpp.balances

default allow := false

allow if {
    input.subject.type == "TPP"
    input.subject.regulated == true
    input.action == "balances:read"
    consent := data.consents[_]
    consent.customer == input.object.owner
    consent.tpp == input.subject.id
    "balances:read" in consent.scope
    consent.status == "ACTIVE"
    time.parse_rfc3339_ns(consent.expiry) > time.now_ns()
}
AWS Cedar · formally verified AWS 2023 · provable termination
permit (
    principal in Role::"TPP_AISP",
    action == Action::"ReadBalance",
    resource
) when {
    resource.owner has consent &&
    resource.owner.consent.tpp == principal &&
    resource.owner.consent.status == "ACTIVE" &&
    resource.owner.consent.scope.contains("balances:read")
};
Datalog · the recursion engine 1977+ · the ancestor of Rego & SpiceDB

OPA's Rego is inspired by Datalog, and a Zanzibar-style relationship check is in effect a recursive reachability query of the kind Datalog expresses directly. The reason: Datalog handles recursion naturally by evaluating rules to a least fixpoint, which is exactly what you need to answer "is X reachable from Y via authorised edges?" in a graph.

// Datalog rules for transitive access
can_read(S, O) :- consent(C, O, S, Scope), active(C), member("balances:read", Scope).
can_read(S, O) :- delegated(S, S2), can_read(S2, O).

?- can_read("BudgetEye", "Sarah").
⊢ yes
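The fixpoint evaluation behind these two rules can be sketched in Python using naive bottom-up iteration. The facts are hypothetical: the consent id c1 and the delegate BudgetEyeAnalytics are invented for illustration.

```python
# Hypothetical base facts.
CONSENT = {  # consent(C, O, S, Scope)
    ("c1", "Sarah", "BudgetEye", frozenset({"accounts:read", "balances:read"})),
}
ACTIVE = {"c1"}                                    # active(C)
DELEGATED = {("BudgetEyeAnalytics", "BudgetEye")}  # delegated(S, S2)

def can_read_fixpoint():
    """Derive all can_read(S, O) facts by iterating the rules to a fixpoint."""
    facts = set()

    # Rule 1: an active consent whose scope includes balances:read.
    for consent_id, owner, subject, scope in CONSENT:
        if consent_id in ACTIVE and "balances:read" in scope:
            facts.add((subject, owner))

    # Rule 2: delegation, reapplied until no new fact appears.
    changed = True
    while changed:
        changed = False
        for subject, delegator in DELEGATED:
            for holder, owner in list(facts):
                if holder == delegator and (subject, owner) not in facts:
                    facts.add((subject, owner))
                    changed = True
    return facts

derived = can_read_fixpoint()
print(("BudgetEye", "Sarah") in derived)
```

Production engines use semi-naive evaluation and indexing, but the derived set of facts is the same.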
Tab 08 · Query · Reasoning · Retrieval

Same question. Three query languages. Then watch GraphRAG ground an LLM.

SPARQL is W3C's standard for RDF. Cypher rose with Neo4j. ISO GQL standardised the property-graph world in 2024. Then GraphRAG retrieves over the graph for the LLM. KG embeddings (TransE, GraphSAGE) predict missing links and similar customers.

SPARQL 1.1 (W3C 2013)
Cypher
ISO GQL 2024
GraphRAG · MSR 2024
TransE 2013
DistMult / ComplEx / RotatE
Node2Vec · DeepWalk 2014-16
GraphSAGE · GAT · R-GCN 2017-19
Probabilistic Databases
Same question · three languages Standards

"Find all customers who hold a card AND have a wealth portfolio above £1M, with their card tier." One question, expressed three ways:

SPARQL · W3C 2013
PREFIX ob: <http://meridian.bank/ontology/>

SELECT ?c ?name ?aum ?tier
WHERE {
  ?c a ob:Cardholder, ob:WealthClient .
  ?c ob:hasFullName ?name .
  ?c ob:hasAum ?aum .
  ?c ob:holdsCard ?card .
  ?card ob:hasCardTier ?tier .
  FILTER(?aum > 1000000)
}
Cypher · Neo4j
MATCH (c:Customer)
WHERE c:Cardholder AND c:WealthClient AND c.aum > 1000000
MATCH (c)-[:HOLDS_CARD]->(card)
RETURN c.fullName AS name, c.aum AS aum, card.tier AS tier
ISO GQL · 2024
MATCH (c:Customer&Cardholder&WealthClient WHERE c.aum > 1000000)
      -[:holds_card]-> (card:Card)
RETURN c.full_name AS name, c.aum, card.tier
The standards story. SPARQL won the RDF/semantic world. Cypher won the property-graph world. ISO GQL (2024) is the convergence — same conceptual model, vendor-neutral. An agentic open-banking platform should emit all three from the same query abstraction, so the catalog can talk to whichever store the bank operates.
GraphRAG · ground the LLM with the graph Microsoft Research 2024 Live

The LLM alone hallucinates. RAG retrieves chunks. GraphRAG retrieves subgraphs — multi-hop, community-aware, with cited edges. Watch a banking question grounded through the KG.

AGENT QUESTION
"Show me all our PLATINUM cardholders who are also Wealth clients with AUM > £1M and have an active TPP consent. What's their typical risk profile?"
KG embeddings · link prediction & similarity TransE 2013 · DistMult · ComplEx · RotatE · GraphSAGE 2017 ML

Embed every entity and relation as a vector. TransE uses h + r ≈ t (head + relation = tail). GraphSAGE samples neighbour aggregations. The catalog can then predict: missing edges, similar customers, suspicious graph patterns (fraud).

TransE example · predicting Sarah's likely TPPs
// Trained embeddings (illustrative)
e(Sarah)    = [0.32, -0.18, 0.71, ...]
e(consents) = [0.05, 0.22, 0.11, ...]
// Predict: who is Sarah likely to consent to?
candidate = argmin_tpp ‖ e(Sarah) + e(consents) - e(tpp) ‖
Candidate TPP | Distance | Verdict
BudgetEye | 0.12 | already consented
MoneyHub | 0.18 | likely candidate
Plaid UK | 0.21 | possible
FraudCo | 0.94 | very unlikely (anomaly?)
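The ranking step can be sketched in Python. The 3-dimensional vectors below are hypothetical toy embeddings, not trained values, so the distances will not reproduce the figures in the table; the point is only the scoring rule ‖h + r − t‖ and the sort.

```python
import math

# Hypothetical toy embeddings; real ones come from training.
EMB = {
    "Sarah":     [0.3, -0.2, 0.7],
    "consents":  [0.1, 0.2, 0.1],
    "BudgetEye": [0.4, 0.1, 0.8],
    "MoneyHub":  [0.5, 0.0, 0.6],
    "FraudCo":   [-0.4, 0.5, 0.1],
}

def transe_distance(head, relation, tail):
    """L2 norm of (head + relation - tail); smaller means more plausible."""
    return math.sqrt(sum(
        (h + r - t) ** 2 for h, r, t in zip(head, relation, tail)
    ))

def rank_tails(head, relation, candidates):
    """Sort candidate tail entities by ascending TransE distance."""
    scored = [
        (name, transe_distance(EMB[head], EMB[relation], EMB[name]))
        for name in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1])

ranked = rank_tails("Sarah", "consents", ["FraudCo", "MoneyHub", "BudgetEye"])
for name, distance in ranked:
    print(name, round(distance, 3))
```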
GraphSAGE · customer similarity for upsell
// Aggregate neighbour signal
h_Sarah^(k+1) = σ( W · [ h_Sarah^(k) ; MEAN({ h_n^(k) ∀ n ∈ N(Sarah) }) ] )
// Find similar customers
top5 = argmax_c cos(h_Sarah, h_c)

The bank's marketing agent can now propose: "customers like Sarah upgraded to PLATINUM after 18 months" with cosine similarity as evidence — anchored in the KG, not in a black-box model.

Probabilistic Databases & Knowledge Vault Suciu et al. · Google 2014

Not every fact in the KG is certain. External-bank data, merchant tags, NER-extracted entities come with confidence scores. Probabilistic databases represent every tuple with a probability. Knowledge Vault (Google 2014) was the canonical example: 1.6B facts, each with a confidence.

// Each KG triple carries a probability
(Sarah, employed_by, Acme_Ltd)        p = 0.92  // from KYC docs
(Sarah, employed_by, Beta_Consulting) p = 0.41  // from a salary credit
(Sarah, spends_at, GreenGrocer)       p = 0.98  // from card txn
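Once tuples carry probabilities, query answers do too. The Python sketch below combines the probabilities shown above under the tuple-independence assumption used by the simplest probabilistic database models. That assumption is a simplification: two competing employer facts may in reality be mutually exclusive rather than independent.

```python
from math import prod

def p_all(probabilities):
    """P(every tuple holds), assuming the tuples are independent."""
    return prod(probabilities)

def p_any(probabilities):
    """P(at least one tuple holds), assuming the tuples are independent."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# "Is Sarah employed by anyone?" is a disjunction over both employer facts.
employed = p_any([0.92, 0.41])

# "Is Sarah employed by Acme_Ltd AND a GreenGrocer shopper?" is a conjunction.
acme_and_grocer = p_all([0.92, 0.98])

print(round(employed, 4), round(acme_and_grocer, 4))
```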
Tab 09 · The Synthesis

The Unified Open Banking Knowledge Graph

Entities resolved (Fellegi-Sunter). Ontology applied (RIGOR + OWL + DL). Mapped from schemas (R2RML/RML). Validated against contracts (SHACL). Provenance attached (PROV + Semirings). Policy enforced (OPA + Zanzibar). Queried via SPARQL/Cypher/GQL. Retrieved via GraphRAG. The graph below is the composition of every model above, on Meridian's real rows.

The composed graph · all 45+ models, one canvas Live · Interactive

Toggle the layers to see how each formal model contributes a different facet. Click any node to see its provenance and attestations.

Resolved Entities
Ontology Classes
Provenance & Sources
Consents & TPPs
Policy & Access
What this graph encodes
Source Rows: 12
Resolved Entities: 3
Ontology Axioms: 17
Active Consents: 2
PROV Edges: 19
Policy Rules: 8
Formal Models: 45+
Years of Research: 56 (since Fellegi-Sunter)

Three real customers. One regulated TPP. Forty-five formal models. Every claim in the graph is attributable to a source row, defended by an ontology axiom, validated by a contract, traceable through PROV, governed by a policy, and queryable through three standards. That is the academic contract of an agentic open-banking knowledge graph.

The final thesis. Modern catalogs and the tools around them (Actian, Atlan, Egeria, Glean, Ontop, AuthZed, Open Policy Agent, RIGOR-style generators) implement these academic models, often without naming them. Naming them explicitly, demonstrating each on the same substrate, and composing them in one interactive graph is what makes this lab faithful: to the research, to the regulator, and to the agent that has to act.