OntoBoom Docs
OntoBoom turns your domain model into an agent contract: a versioned bundle of OWL/RDF ontology + database mapping + AI tool schema + SHACL validation + SHA-256 manifest. The three stages — design → publish → connect — match the layout below.
Quickstart
- Sign up at ontoboom.com/register. Free plan: 1 project, 1 ontology, 5 AI Copilot credits.
- Create a project. Either fork a curated pack from hub.ontoboom.com/@ontoboom (crm, e-commerce, fintech, b2b-saas, healthcare-lite, customer-360) or start with your own database.
- Capture your schema — Database tab → add connection → snapshot.
- Map schema to ontology — Mapping tab → auto-suggest + manual fixes.
- Export the OPS package — Export tab → ZIP with ontology.ttl + mapping.json + tool-schema.json + shapes.ttl + ops.json.
- Publish to Hub (optional, but unlocks the live MCP endpoint).
- Wire to your agent using one of the framework sections below.
1 · Design
Designing the ontology
The Studio editor at app.ontoboom.com is where the ontology actually gets built. You can start from scratch, fork a curated @ontoboom pack from the Hub, or reverse-engineer from a database snapshot — pick whichever matches your team's starting point.
The canvas
React-Flow-based visual graph editor. Drag elements from the left palette onto the canvas, drag from a class's edge to create a relationship. Selecting a node opens the Inspector panel where you set the IRI, label, comment, SHACL constraints, and (for object properties) characteristics like functional, inverseFunctional, transitive, symmetric.
Five element types live in the palette:
- Class — an entity type (e.g. Customer, Account, Order). Has an IRI, optional superclasses, optional disjointWith / equivalentClasses sets.
- Object property — a typed relationship between two classes. Drawn as a directed edge with a label. Domain + range are inferred from the two ends but can be overridden.
- Data property — a typed attribute on a class (string, decimal, date, etc.). Edges are not drawn for these; they live in the class's inspector.
- Individual — a named instance of a class (e.g. USD as an instance of Currency). Useful when an enumeration is small and stable.
- SHACL Constraint — a NodeShape targeting a class with one or more PropertyShapes.
AI Copilot
Chat panel on the right of the canvas. Powered by gpt-4o-mini with structured tool calls — it doesn't just suggest text, it actually executes operations against your ontology JSON (add class, rename property, fix validation error, generate SHACL constraints). Useful prompts:
- "Add a Person class with name and email properties."
- "Create a relationship: Order placedBy Customer."
- "Fix the validation errors below." — pair this with the validation panel which auto-feeds error context into the prompt.
- "Add SHACL constraints to enforce email format on Person."
Each Copilot turn consumes 1 AI Credit (Free includes 5/mo; Pro and Teams include 5/mo with the option to purchase more from the profile page).
Subject areas
Visual grouping for large ontologies. Each class can belong to zero or more subject areas, each rendered as a colored background region on the canvas. Use them to keep a 50-class ontology legible (customer area, billing area, fulfilment area, etc.).
Imports — composing across the Hub
The Imports panel in the editor lets you declare a semver-pinned dependency on another Hub ontology. Type a Hub-search query, pick a result, set a constraint (e.g. ^1.0), and the imported entities show up as ghost nodes you can use as relationship targets without owning them. The imports list is persisted in the ontology JSON as a top-level imports[] array and travels with the publish — downstream consumers see your declared deps in the Hub UI.
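To make a constraint like ^1.0 concrete, here is a minimal sketch of caret matching. It assumes npm-style caret semantics (the leftmost non-zero component may not change); the docs do not state which semver dialect the Hub resolver uses, and parse_version and satisfies_caret are illustrative names, not part of OntoBoom.

```python
def parse_version(text: str) -> tuple:
    """Parse 'MAJOR[.MINOR[.PATCH]]' into a 3-tuple, padding missing parts with 0."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def satisfies_caret(version: str, constraint: str) -> bool:
    """Return True if `version` falls inside the range implied by a caret constraint."""
    if not constraint.startswith("^"):
        raise ValueError("only caret constraints are handled in this sketch")
    lower = parse_version(constraint[1:])
    # Assumed rule: the leftmost non-zero component must stay the same.
    if lower[0] > 0:
        upper = (lower[0] + 1, 0, 0)
    elif lower[1] > 0:
        upper = (0, lower[1] + 1, 0)
    else:
        upper = (0, 0, lower[2] + 1)
    return lower <= parse_version(version) < upper
```

Under these assumptions, satisfies_caret("1.4.2", "^1.0") holds while satisfies_caret("2.0.0", "^1.0") does not, so a new major release of the imported ontology would not be picked up silently.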
SHACL validation in real time
The bottom Validation panel runs constraint checks continuously as you edit. Errors include the path (which class / property is wrong), a one-line description, and a Fix with Copilot button that pipes the errors into the Copilot prompt. Constraints you can express today:
- Cardinality: minCount, maxCount
- Data type: xsd:string, xsd:integer, xsd:decimal, xsd:date, xsd:dateTime, xsd:boolean, etc.
- Value enumeration: sh:in with a list of allowed values
- Regex pattern: sh:pattern (e.g. email, E.164 phone, ISO currency code)
- Numeric range: minInclusive / maxInclusive / minExclusive / maxExclusive
- String length: minLength / maxLength
- Object-property class target: sh:class — restrict an edge to instances of a specific class
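As an illustration of the regex constraint, the sketch below tries out candidate patterns for the three examples named above before they are pasted into an sh:pattern field. The patterns are illustrative assumptions, not the ones OntoBoom or Copilot generates. SHACL evaluates sh:pattern with SPARQL REGEX semantics, which match anywhere in the string, so each pattern carries explicit ^ and $ anchors.

```python
import re

# Illustrative patterns only; none are taken from OntoBoom-generated shapes.
PATTERNS = {
    "email": r"^[^@\s]+@[^@\s]+\.[^@\s]+$",
    "e164_phone": r"^\+[1-9][0-9]{1,14}$",
    "iso_currency": r"^[A-Z]{3}$",
}


def matches(kind: str, value: str) -> bool:
    """Search (not full-match) the value, mirroring how an anchored sh:pattern behaves."""
    return re.search(PATTERNS[kind], value) is not None
```

Checking a few good and bad values locally is quicker than round-tripping through the Validation panel, although the panel remains the source of truth for what the shapes actually enforce.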
Versioning
Every save is a new OntologyVersion row. The version selector in the toolbar lets you switch between versions, compare diagrams, or roll back by selecting an older version and saving it forward. Team projects also support soft locking: when a member opens an ontology, others see "Locked by Alice — read-only" and can request the lock when she's done.
Export formats
Any saved version exports to:
- Turtle (.ttl) — canonical OWL/RDF serialization, what ontology.ttl in the OPS package contains
- JSON-LD (.jsonld) — RDF in JSON syntax, good for web APIs
- RDF/XML (.rdf) — legacy enterprise tools
- OntoBoom JSON (.json) — the native editor format with diagram layout preserved
- Cypher (.cypher) — node + relationship statements for loading into Neo4j or any property-graph DB
When you're happy with the model, the next two sub-sections cover how to attach it to a real database (Database connectors) and what gets bundled into the deployable OPS package.
Database connectors
Three dialects are supported today: Postgres, MySQL/MariaDB, SQL Server. Connection credentials are encrypted at rest with Fernet (key in DB_CREDENTIALS_ENCRYPTION_KEY).
- PostgreSQL — host, port (5432), database, schema, user, password
- MySQL / MariaDB — host, port (3306), database, user, password
- SQL Server — host, port (1433), database, schema, user, password
After connection:
- Capture snapshot — Studio ingests the table names, column types, and foreign keys.
- Reverse-engineer to ontology — generates a starter ontology with classes per table and data properties per column. You then refine in the canvas.
- Auto-map — deterministic name matching plus AI suggestions. Each rule has a confidence score and a one-line reasoning string you can review.
Mapping schema to ontology
Once you have an ontology in step 1 and a database snapshot in step 2, mapping is the bridge: a set of rules that says "column X in table Y means concept Z in the ontology." This is the artifact that makes semantic_query possible — without it, agents would still have to guess column names and joins.
The three rule target types
Each row in mapping.json is one rule. Every rule has the same shape but targets one of three ontology element types:
- class — the whole table maps to a class. source_column is null. Example: table customers maps to class Customer.
- data_property — a single column maps to a data property (a typed attribute) on a class. Example: customers.email maps to the data property emailAddress on Customer.
- object_property — a foreign-key column maps to a typed relationship between two classes. Example: orders.customer_id maps to the object property placedBy from Order to Customer.
The Mapping tab — what you actually do
Open the Mapping tab on a project that has (a) at least one ontology and (b) a captured database snapshot.
- Auto-suggest — click the button at the top of the Mapping tab. Studio uses deterministic name matching (table name ≈ class label, column name ≈ property label) and the AI Copilot to propose rules. Each proposal lands in the rule list with a confidence score (0.0–1.0) and a reasoning string explaining the match. You don't have to accept any of them.
- Review + accept — each row in the rule list has Accept / Reject / Edit buttons. Reject removes the suggestion; Accept commits it. Edit opens the rule for adjustment.
- Add manual rules — for joins the auto-matcher misses (denormalised columns, JSON fields, etc.), click New rule, pick the source table / column, pick the ontology target, choose the rule type, save. Manual rules get confidence 1.0 by convention.
- Save as version — every save creates a new MappingVersion row. Older versions stay accessible via the version selector at the top of the tab. The latest version is what the Export tab bundles and the Playground tab queries against.
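The deterministic half of auto-suggest (table name ≈ class label, column name ≈ property label) can be approximated with the standard library. This is a rough sketch under stated assumptions, not OntoBoom's matcher: normalise, suggest, the crude plural stripping, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case, drop separators, and strip a trailing plural 's' (crude on purpose)."""
    flat = name.lower().replace("_", "").replace(" ", "").replace("-", "")
    return flat[:-1] if flat.endswith("s") and len(flat) > 3 else flat


def suggest(source_name: str, labels: list, threshold: float = 0.6):
    """Return (best_label, confidence, reasoning), or None if nothing clears the threshold."""
    scored = [
        (SequenceMatcher(None, normalise(source_name), normalise(label)).ratio(), label)
        for label in labels
    ]
    confidence, best = max(scored)
    if confidence < threshold:
        return None
    reasoning = f"'{source_name}' matches '{best}' by normalised name similarity."
    return best, round(confidence, 2), reasoning
```

For example, suggest("customers", ["Customer", "Order"]) picks Customer with confidence 1.0 because both names normalise to the same string. Anything below the threshold is left for manual rules.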
What a rule looks like (mapping.json)
{
  "rules": [
    {
      "source_table": "customers",
      "source_column": null,
      "target_type": "class",
      "target_iri": "https://example.org/sales#Customer",
      "target_label": "Customer",
      "confidence": 0.95,
      "reasoning": "Table 'customers' maps to class Customer by name + plural-singular match."
    },
    {
      "source_table": "customers",
      "source_column": "email",
      "target_type": "data_property",
      "target_iri": "https://example.org/sales#emailAddress",
      "target_label": "email address",
      "confidence": 0.92,
      "reasoning": "Column 'email' matches data property emailAddress by label similarity."
    },
    {
      "source_table": "orders",
      "source_column": "customer_id",
      "target_type": "object_property",
      "target_iri": "https://example.org/sales#placedBy",
      "target_label": "placed by",
      "confidence": 0.87,
      "reasoning": "FK orders.customer_id -> customers.id matches placedBy: Order -> Customer."
    }
  ]
}
The source block in the same JSON carries the dialect (postgres / mysql / mssql) and default schema, so the LLM that turns a question into SQL knows exactly how to qualify table names.
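Because every rule shares one shape, a short consistency check can catch hand-edited mistakes before export. The sketch below is an assumption-laden illustration: the required-key set is inferred from the examples on this page rather than from a published schema, and check_rules is a hypothetical helper.

```python
# Keys that appear in every example rule on this page (assumed to be required).
REQUIRED_KEYS = {"source_table", "source_column", "target_type", "target_iri"}
TARGET_TYPES = {"class", "data_property", "object_property"}


def check_rules(mapping: dict) -> list:
    """Return human-readable problems; an empty list means the rules look consistent."""
    problems = []
    for i, rule in enumerate(mapping.get("rules", [])):
        missing = REQUIRED_KEYS - rule.keys()
        if missing:
            problems.append(f"rule {i}: missing {sorted(missing)}")
            continue
        kind = rule["target_type"]
        if kind not in TARGET_TYPES:
            problems.append(f"rule {i}: unknown target_type {kind!r}")
        elif kind == "class" and rule["source_column"] is not None:
            problems.append(f"rule {i}: class rules should have a null source_column")
        elif kind != "class" and not rule["source_column"]:
            problems.append(f"rule {i}: {kind} rules need a source_column")
        confidence = rule.get("confidence")
        if confidence is not None and not 0.0 <= confidence <= 1.0:
            problems.append(f"rule {i}: confidence {confidence} outside 0.0-1.0")
    return problems


sample = {"rules": [
    {"source_table": "customers", "source_column": None, "target_type": "class",
     "target_iri": "https://example.org/sales#Customer", "confidence": 0.95},
    {"source_table": "customers", "source_column": "email", "target_type": "data_property",
     "target_iri": "https://example.org/sales#emailAddress", "confidence": 0.92},
]}
print(check_rules(sample))  # prints []
```

In practice the dict would come from json.load on mapping.json instead of the inline sample.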
Why this matters at query time
When an agent calls semantic_query("customers with active subscriptions"), the rule list is included verbatim in the system prompt to the LLM that generates the SQL. Every concept the question mentions ("customer", "active subscription") maps to a row in the rule list, which maps to a concrete schema.table.column reference. The model doesn't have to invent column names — it just composes joins from the rules it's been handed.
That's the entire premise: well-mapped ontology + well-mapped schema = an agent that queries your data accurately, with the real column names and joins.
Common mapping patterns
- One row, one entity: a table whose primary key identifies a unique instance — map the table to a class, then map each column to a data property.
- Foreign keys: a column ending in _id that points at another table — map the column as an object property from the source class to the target class.
- Join tables: a table with two FKs and no other data — map both FKs as object properties from the join-table class, OR (more often) model the join table as a many-to-many relationship between the two endpoint classes and skip mapping the join table itself.
- Polymorphic / discriminator columns: a type column that switches subtype — map the type column to rdf:type in the manual rule editor and create one class per discriminator value.
- Materialized views: mappable just like tables. Good for denormalising complicated joins so the mapping rule list stays simple.
- JSON columns: Postgres jsonb keys can be mapped using a JSON path in the column field (address->>'street'); the LLM will use it verbatim in generated SQL.
The OPS package
The OPS package is what makes OntoBoom an agent tool, not just an ontology editor. Five files, one zip:
ontology.ttl
Your OWL ontology in Turtle. Load it into Protege, Apache Jena, rdflib, or any SPARQL endpoint.
mapping.json
OBS-Mapping v1.0. Each rule maps a database element to an ontology concept:
{
  "source": {"dialect": "postgres", "schema": "public"},
  "rules": [
    {
      "source_table": "accounts",
      "source_column": null,
      "target_type": "class",
      "target_iri": "https://example.org/finance#Account",
      "target_label": "Account",
      "confidence": 0.95,
      "reasoning": "Table accounts maps to class Account by name + semantics."
    },
    {
      "source_table": "accounts",
      "source_column": "balance",
      "target_type": "data_property",
      "target_iri": "https://example.org/finance#balanceAvailable",
      "target_label": "balance available",
      "confidence": 0.88
    }
  ]
}
tool-schema.json
OBS-Tool v1.0. A JSON-schema function definition you drop into any framework that follows the convention (OpenAI, Anthropic, LangChain, Bedrock, Vertex, …). Exposes a semantic_query tool whose arguments are ontology-aware filters — not raw SQL.
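The framework sections later on this page all read a top-level tools array whose entries carry name, description, and parameters. A quick sanity check on that shape might look like the sketch below. The inline example is a hypothetical stand-in for tool-schema.json: the question parameter is an assumption, and the real argument schema (ontology-aware filters) will differ.

```python
def tool_names(schema: dict) -> list:
    """Return the names of tools that carry the three keys the framework adapters read."""
    names = []
    for tool in schema.get("tools", []):
        if {"name", "description", "parameters"} <= tool.keys():
            names.append(tool["name"])
    return names


# Hypothetical stand-in for tool-schema.json; the real parameters block may differ.
example = {
    "tools": [
        {
            "name": "semantic_query",
            "description": "Query the domain through ontology concepts.",
            "parameters": {
                "type": "object",
                "properties": {"question": {"type": "string"}},
                "required": ["question"],
            },
        }
    ]
}
print(tool_names(example))  # prints ['semantic_query']
```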
shapes.ttl
SHACL constraints — cardinality, datatype, value-set enumerations, regex patterns. Validate instance data before ingesting.
ops.json
Manifest with SHA-256 + byte size per file. Use it to verify integrity after transfer or to pin a specific package version in your deploy pipeline.
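A deploy pipeline could verify an unpacked package along these lines. The manifest layout used here, a top-level files list with path, sha256, and bytes keys, is an assumption for illustration; check the actual ops.json in your export and adjust the key names. verify_package is a hypothetical helper.

```python
import hashlib
import json
from pathlib import Path


def verify_package(package_dir: str, manifest_name: str = "ops.json") -> list:
    """Compare each listed file's size and SHA-256 with the manifest; return mismatches."""
    root = Path(package_dir)
    manifest = json.loads((root / manifest_name).read_text())
    problems = []
    # Assumed layout: {"files": [{"path": ..., "sha256": ..., "bytes": ...}, ...]}
    for entry in manifest["files"]:
        path = root / entry["path"]
        if not path.is_file():
            problems.append(f"{entry['path']}: missing")
            continue
        data = path.read_bytes()
        if len(data) != entry["bytes"]:
            problems.append(f"{entry['path']}: size {len(data)} != {entry['bytes']}")
        if hashlib.sha256(data).hexdigest() != entry["sha256"]:
            problems.append(f"{entry['path']}: SHA-256 mismatch")
    return problems
```

Calling verify_package on the unzipped directory and failing the build on a non-empty result is one way to pin a package version in a pipeline.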
Playground (preview)
The Playground tab is a live preview of what an agent will experience when it calls semantic_query against this project. Use it to sanity-check your mapping before you ship the OPS package or publish to Hub.
- Type a natural-language question into the input box at the top of the Playground tab — e.g. "customers with active subscriptions in California created in the last 30 days".
- Studio sends the question + the current mapping rules + a small system prompt to gpt-4o-mini, which produces a SQL SELECT (always LIMIT 100, never a mutating statement).
- Studio executes the SQL against the connected database from step 2 and renders the result — the generated SQL on top, the column headers and row data underneath.
- Iterate — if the SQL is wrong (referenced an unmapped column, picked the wrong join), the fix is almost always to add or correct a mapping rule, not to re-prompt the LLM. Go back to the Mapping tab, fix the rule, save, return.
Playground consumes 1 AI Credit per query (same accounting as Copilot). It's your loop-closer before exporting or publishing — if Playground returns the right answer, your agent will too, because they use the same semantic_query implementation.
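If you run semantic_query yourself, a guard in the same spirit as the Playground rule (SELECT only, at most 100 rows) is worth adding before executing model-generated SQL. The sketch below is deliberately conservative and is not OntoBoom's implementation: guard_select and the keyword list are assumptions, the check rejects any statement that merely contains a mutating keyword, and the LIMIT wrapper suits Postgres and MySQL rather than SQL Server.

```python
import re

# Keywords that indicate a statement could change data or schema.
MUTATING = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|merge)\b",
    re.IGNORECASE,
)


def guard_select(sql: str, limit: int = 100) -> str:
    """Reject anything but a single SELECT and wrap it so at most `limit` rows return."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"select\b", statement, re.IGNORECASE):
        raise ValueError("only SELECT statements are allowed")
    if MUTATING.search(statement):
        raise ValueError("statement contains a mutating keyword")
    return f"SELECT * FROM ({statement}) AS _q LIMIT {limit}"
```

A read-only database role remains the stronger safeguard; a textual check like this only reduces accidents.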
SHACL validation
Validate agent output (or any external data) against shapes.ttl before ingesting it into a production graph:
from pyshacl import validate
from rdflib import Graph

data_graph = Graph().parse("instance-data.ttl", format="turtle")
shapes_graph = Graph().parse("shapes.ttl", format="turtle")

conforms, _, report = validate(
    data_graph,
    shacl_graph=shapes_graph,
    inference="rdfs",  # apply RDFS subclass reasoning
)
if not conforms:
    print(report)  # human-readable violations
    raise ValueError("Data failed SHACL validation")
Constraints supported in OntoBoom-generated shapes: cardinality (minCount, maxCount), datatype, enumeration (sh:in), regex pattern, numeric range, string length, and object-property class targets.
2 · Publish
Publishing to Hub
From the Studio editor toolbar, click Publish to Hub. The modal pre-fills:
- Namespace — your auto-provisioned @your-handle (or another namespace you own).
- Slug — derived from the ontology name, kebab-case. Locked after first publish.
- Version — next patch above the latest published. Bump major/minor manually in the field.
- Visibility — Private (default), Unlisted, or Public. Public requires an SPDX license.
Once published you get three URLs for the same artifact:
- https://hub.ontoboom.com/@ns/slug — Hub UI page
- https://api.ontoboom.com/hub/v1/ontologies/@ns/slug@x.y.z — JSON pull
- https://mcp.ontoboom.com/o/@ns/slug@x.y.z — live MCP endpoint
Versions are immutable. Retire a buggy version with yank, not delete — the record stays so downstream consumers see a clear signal.
Three visibility levels. Public: discoverable in Hub search, served anonymously over MCP, indexable by search engines. Unlisted: not enumerated in search, but anyone with the URL reads + serves over MCP. Private: only the namespace owner can read; MCP requires a Bearer obt_ token (see below).
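For scripting around a publish, the version pre-fill and the three URL patterns can be reproduced with a couple of string helpers. This is a sketch based only on the patterns shown in this section; next_patch and published_urls are hypothetical names, and the Hub remains the authority on which version comes next.

```python
def next_patch(latest: str) -> str:
    """Return the next patch version above 'MAJOR.MINOR.PATCH'."""
    major, minor, patch = (int(part) for part in latest.split("."))
    return f"{major}.{minor}.{patch + 1}"


def published_urls(namespace: str, slug: str, version: str) -> dict:
    """Compose the Hub page, JSON pull, and MCP endpoint URLs for one published version."""
    ref = f"@{namespace}/{slug}"
    return {
        "hub": f"https://hub.ontoboom.com/{ref}",
        "api": f"https://api.ontoboom.com/hub/v1/ontologies/{ref}@{version}",
        "mcp": f"https://mcp.ontoboom.com/o/{ref}@{version}",
    }
```

For instance, published_urls("ontoboom", "crm", next_patch("0.1.0")) yields an MCP URL ending in crm@0.1.1.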
3 · Connect your agent
Every framework below consumes one of the same two artifacts: tool-schema.json from your OPS export, or the live mcp.ontoboom.com/o/@ns/slug@ver endpoint if you've published. Pick the path your stack already speaks.
LangChain / LangGraph
Path A — wrap your OPS package as a LangChain tool. The full semantic_query implementation is auto-generated in the OPS README:
import json, psycopg2
from openai import OpenAI
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

with open("mapping.json") as f:
    mapping = json.load(f)

def semantic_query(question: str, db_dsn: str) -> dict:
    """Translate a question to SQL using the OPS mapping rules and execute it."""
    rules = mapping["rules"]
    schema = mapping["source"].get("schema", "public")
    rule_lines = []
    for r in rules:
        target = r.get("target_label") or r["target_iri"]
        table = r["source_table"]; col = r.get("source_column") or ""
        ref = f"{schema}.{table}.{col}" if col else f"table {schema}.{table}"
        rule_lines.append(f" - {target} ({r['target_type']}) -> {ref}")
    sql = OpenAI().chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content":
                "Translate to a single SELECT with LIMIT 100.\n" + "\n".join(rule_lines)},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content.strip().strip("`")
    conn = psycopg2.connect(db_dsn)
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM ({sql}) AS _q LIMIT 100")
    cols = [d[0] for d in cur.description]
    rows = cur.fetchall()
    conn.close()
    return {"sql": sql, "columns": cols, "rows": rows}

agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=[semantic_query])
agent.invoke({"messages": [(
    "user", "How many investment accounts opened in the last 30 days?"
)]})
Path B — connect to the live MCP endpoint via the LangChain MCP adapter. No OPS download; the published version is queried at request time.
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

async with MultiServerMCPClient({
    "ontoboom": {
        "url": "https://mcp.ontoboom.com/o/@ontoboom/customer-360@0.1.0",
        "transport": "sse",
        # for private ontologies:
        # "headers": {"Authorization": "Bearer obt_your_token"},
    },
}) as client:
    tools = await client.get_tools()
    agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=tools)
    result = await agent.ainvoke({"messages": [(
        "user", "List entity types in the customer-360 ontology"
    )]})
OpenAI function calling
tool-schema.json is already shaped for OpenAI — just unwrap the tools array:
import json
from openai import OpenAI

with open("tool-schema.json") as f:
    schema = json.load(f)

functions = [{
    "name": t["name"],
    "description": t["description"],
    "parameters": t["parameters"],
} for t in schema["tools"]]

response = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content":
        "Find customers with active subscriptions in California"}],
    functions=functions,
)
fc = response.choices[0].message.function_call
print(fc.name, fc.arguments)
Anthropic SDK (Claude API)
Two integration paths with Claude — direct tool use, or via an MCP server connection. Direct tool use:
import json, anthropic

with open("tool-schema.json") as f:
    schema = json.load(f)

# Anthropic's tools format is nearly identical to OpenAI's
tools = [{
    "name": t["name"],
    "description": t["description"],
    "input_schema": t["parameters"],
} for t in schema["tools"]]

resp = anthropic.Anthropic().messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user",
        "content": "Pull customers who opened a support ticket this week"}],
)
MCP via the official Python MCP SDK:
from mcp import ClientSession
from mcp.client.sse import sse_client

async with sse_client(
    "https://mcp.ontoboom.com/o/@ontoboom/customer-360@0.1.0",
    headers={"Authorization": "Bearer obt_your_token"},  # only for private
) as (read, write):
    async with ClientSession(read, write) as session:
        await session.initialize()
        tools = await session.list_tools()
        result = await session.call_tool(
            "describe_entity", {"name": "SupportTicket"}
        )
        print(result)
AWS Strands Agents
Strands Agents is AWS's open-source agent SDK. It has native MCP support, so the OntoBoom endpoint plugs in directly:
from strands import Agent
from strands.tools.mcp import MCPClient
from mcp.client.sse import sse_client

mcp = MCPClient(lambda: sse_client(
    "https://mcp.ontoboom.com/o/@ontoboom/crm@0.1.0",
))
with mcp:
    agent = Agent(
        model="us.anthropic.claude-opus-4-7-v1:0",  # Bedrock model id
        tools=mcp.list_tools_sync(),
    )
    agent("Which Opportunity stages are defined in this ontology?")
GCP Vertex AI Agent Builder
Vertex AI Agent Builder consumes function declarations on the Gemini API. Convert tool-schema.json:
import json
from vertexai.generative_models import (
    GenerativeModel, Tool, FunctionDeclaration,
)

with open("tool-schema.json") as f:
    schema = json.load(f)

declarations = [
    FunctionDeclaration(
        name=t["name"],
        description=t["description"],
        parameters=t["parameters"],
    )
    for t in schema["tools"]
]
model = GenerativeModel(
    "gemini-2.5-pro",
    tools=[Tool(function_declarations=declarations)],
)
resp = model.generate_content(
    "Customers who churned in Q3 with LTV above 5000"
)
fc = resp.candidates[0].content.parts[0].function_call
print(fc.name, dict(fc.args))
For Agent Engine deployment, register the function as a reasoning-engine tool — same JSON-schema parameters, same tool name.
Azure AI Foundry
Foundry Agents accept function tools whose parameters are JSON Schema — identical shape to tool-schema.json:
import json
from azure.ai.agents import AgentsClient
from azure.identity import DefaultAzureCredential

with open("tool-schema.json") as f:
    schema = json.load(f)

tools = [{
    "type": "function",
    "function": {
        "name": t["name"],
        "description": t["description"],
        "parameters": t["parameters"],
    },
} for t in schema["tools"]]

client = AgentsClient(
    endpoint="https://<your-project>.services.ai.azure.com",
    credential=DefaultAzureCredential(),
)
agent = client.create_agent(
    model="gpt-4o-mini",
    name="ontoboom-semantic-query",
    instructions="Use semantic_query to answer questions about the domain.",
    tools=tools,
)
For MCP-style integration, Foundry's upcoming MCP server support consumes the same mcp.ontoboom.com/o/@ns/slug@ver URL.
CrewAI
Tool wrapper around semantic_query:
from crewai import Agent, Task, Crew
from crewai_tools import tool

# semantic_query as defined in the LangChain section above
@tool("Semantic query")
def semantic_query_tool(question: str) -> str:
    """Answer questions about the domain by translating them
    to SQL using the OntoBoom OPS mapping and executing them."""
    result = semantic_query(question, db_dsn=DB_DSN)
    return str(result)

analyst = Agent(
    role="Data Analyst",
    goal="Answer questions about customer behaviour grounded in the ontology",
    tools=[semantic_query_tool],
)
crew = Crew(
    agents=[analyst],
    tasks=[Task(
        description="How many trial customers converted last month?",
        agent=analyst,
        expected_output="A number with the SQL that produced it.",
    )],
)
crew.kickoff()
CrewAI also supports MCP via crewai_tools.MCPServerAdapter; the pattern matches the LangChain MCP block above with the same mcp.ontoboom.com/o/@ns/slug@ver URL.
Claude Desktop / Cursor (no-code MCP)
For end-user tools that speak MCP natively, paste the endpoint URL into the client config. Claude Desktop → ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "ontoboom-crm": {
      "url": "https://mcp.ontoboom.com/o/@ontoboom/crm@0.1.0",
      "transport": "sse"
    }
  }
}
Cursor → .cursor/mcp.json:
{
  "mcpServers": {
    "ontoboom-crm": {
      "url": "https://mcp.ontoboom.com/o/@ontoboom/crm@0.1.0",
      "transport": "sse"
    }
  }
}
Restart the client; MCP tools appear in the tool tray.
Shell test of any MCP endpoint:
curl https://mcp.ontoboom.com/o/@ontoboom/crm@0.1.0/health

curl -sS -X POST https://mcp.ontoboom.com/o/@ontoboom/crm@0.1.0 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

curl -sS -X POST https://mcp.ontoboom.com/o/@ontoboom/crm@0.1.0 \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call",
       "params":{"name":"describe_entity","arguments":{"name":"Account"}}}'
Tools exposed today: list_entities, describe_entity. Both are derived from the manifest at request time, so they always reflect the published version.
Private MCP via Bearer obt_ token
Anonymous traffic only sees Public + Unlisted ontologies. To serve a Private ontology over MCP, include a Bearer token from the namespace owner (mint one at profile → API tokens):
# curl
curl -H "Authorization: Bearer obt_your_token_here" \
  https://mcp.ontoboom.com/o/@your-handle/private-slug@1.0.0/health

# Claude Desktop config — note the headers block
{
  "mcpServers": {
    "ontoboom-private": {
      "url": "https://mcp.ontoboom.com/o/@your-handle/private-slug@1.0.0",
      "transport": "sse",
      "headers": {
        "Authorization": "Bearer obt_your_token_here"
      }
    }
  }
}
The token must belong to the namespace owner (or an admin). Token-scope-by-project does not apply here — Hub serving is namespace-gated, not project-gated.
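The access rule described here and in the visibility levels above reduces to a small decision function. The sketch below is one reading of those rules, not the server's code: may_serve is a hypothetical name, and treating "admin" as a set of user handles is an assumption.

```python
from typing import Optional


def may_serve(visibility: str, token_user: Optional[str], owner: str,
              admins: frozenset = frozenset()) -> bool:
    """Decide whether an MCP request may read an ontology at the given visibility."""
    if visibility in ("public", "unlisted"):
        return True  # anonymous traffic is allowed for these two levels
    if visibility == "private":
        # A token is required and must belong to the namespace owner or an admin.
        return token_user is not None and (token_user == owner or token_user in admins)
    raise ValueError(f"unknown visibility: {visibility!r}")
```

Note that the project a token was scoped to plays no part in the decision, which matches the namespace-gated behaviour described above.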
Reference
Public REST API
Programmatic access uses Bearer tokens (prefix obt_). Generate one in your profile → API tokens. The token is shown once on issuance; only the hash is stored.
curl https://api.ontoboom.com/api/v1/projects \
  -H "Authorization: Bearer obt_your_token_here"

curl https://api.ontoboom.com/api/v1/ontologies/<uuid> \
  -H "Authorization: Bearer obt_your_token_here"
Full reference (OpenAPI/Swagger): api.ontoboom.com/api/v1/docs.
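The same calls can be made from Python with only the standard library. This sketch builds the request without sending it; api_request is a hypothetical helper, and the obt_ prefix check is a convenience added here rather than documented client behaviour.

```python
import urllib.request

API_BASE = "https://api.ontoboom.com/api/v1"


def api_request(path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request for the REST API."""
    if not token.startswith("obt_"):
        raise ValueError("OntoBoom API tokens are expected to start with 'obt_'")
    return urllib.request.Request(
        f"{API_BASE}/{path.lstrip('/')}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


req = api_request("projects", "obt_your_token_here")
# To send it: urllib.request.urlopen(req, timeout=10).read()
```

Keeping request construction separate from sending makes the helper easy to test without network access.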
Something missing or wrong? Open a thread at support or contact us.