Ontology in Information Architecture Explained
Ontology in information architecture (IA) is the explicit, formal specification of concepts, their properties, and the relationships between them within an information system. This page covers the structural mechanics, classification boundaries, professional applications, and known tensions that arise when ontological frameworks are deployed in digital environments. Ontology functions as one of the three foundational components of IA alongside taxonomy and controlled vocabularies, and understanding its distinct role is essential for practitioners designing complex knowledge systems.
- Definition and scope
- Core mechanics or structure
- Causal relationships or drivers
- Classification boundaries
- Tradeoffs and tensions
- Common misconceptions
- Checklist or steps (non-advisory)
- Reference table or matrix
Definition and scope
Within information architecture, an ontology is a formal, machine-readable model that specifies a set of concepts (called classes or entities), the attributes of those concepts (properties or slots), and the logical relationships between them (relations or axioms). The W3C defines an ontology as "a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse" (W3C Semantic Web Activity).
The scope of an ontology in IA extends beyond simple categorization. Where a taxonomy organizes terms in a hierarchy, an ontology encodes why and how those terms relate — including non-hierarchical relations such as is-caused-by, is-part-of, is-co-located-with, and is-contraindicated-with. This expressive capacity enables reasoning, inference, and automated query processing that flat hierarchies cannot support.
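As an illustration of this expressive capacity, the contrast can be sketched in plain Python; the concept and relation names below are hypothetical, not drawn from any published ontology:

```python
# A taxonomy can only answer "what is a kind of what" (is-a edges):
taxonomy = {
    "SecurityIncident": "Incident",
    "Incident": "Event",
}

# An ontology adds arbitrary typed relations between the same concepts,
# expressed here as (subject, relation, object) triples.
ontology = [
    ("SecurityIncident", "is-a", "Incident"),
    ("SecurityIncident", "is-caused-by", "Vulnerability"),
    ("Patch", "is-part-of", "Release"),
    ("Patch", "mitigates", "Vulnerability"),
]

def related(subject, relation, triples):
    """Return all objects linked to `subject` by the typed `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]

# A question a pure hierarchy cannot express: what causes a SecurityIncident?
print(related("SecurityIncident", "is-caused-by", ontology))  # ['Vulnerability']
```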
Ontologies operate at scales ranging from a domain-specific model covering 50 classes (such as a product catalog ontology for an e-commerce platform) to upper ontologies like SUMO (Suggested Upper Merged Ontology) that encode thousands of concepts intended to be universally applicable across domains.
The information architecture principles governing ontological design draw heavily from formal knowledge representation, a field that spans library science, computer science, and cognitive linguistics.
Core mechanics or structure
An ontology consists of five structural components:
- Classes — The categories or types of things in the domain (e.g., Document, Author, Topic).
- Instances (individuals) — Specific members of a class (e.g., the document titled RFC 3986 is an instance of Document).
- Properties — Attributes describing classes or instances. OWL (Web Ontology Language), the W3C standard for ontology encoding, distinguishes data properties (literal values) from object properties (relationships between instances).
- Relationships — Typed, directed links between classes or instances. Common relationship types include subClassOf, partOf, sameAs, inverseOf, and domain-specific relations.
- Axioms — Logical constraints that govern valid states of the model, such as cardinality restrictions ("an author must have exactly one primary affiliation") or disjointness declarations ("a DocumentSection cannot also be a Document").
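A minimal sketch of the five components as plain Python data structures (all class, instance, and property names are illustrative) shows how an axiom can be checked mechanically:

```python
# Classes: the types of things in the domain.
classes = {"Document", "Author", "Topic"}

# Instances: individual -> asserted class membership.
instances = {"RFC 3986": "Document", "T. Berners-Lee": "Author"}

# Properties: data-property values attached to instances.
properties = {"RFC 3986": {"title": "URI: Generic Syntax", "year": 2005}}

# Relationships: typed object-property links between instances.
relationships = [("T. Berners-Lee", "authorOf", "RFC 3986")]

# Axiom: every Author must have exactly one primaryAffiliation (cardinality).
def check_affiliation_axiom(instances, properties):
    violations = []
    for individual, cls in instances.items():
        if cls == "Author":
            affiliations = properties.get(individual, {}).get("primaryAffiliation", [])
            if len(affiliations) != 1:
                violations.append(individual)
    return violations

# The Author instance has no recorded affiliation, so the axiom flags it.
print(check_affiliation_axiom(instances, properties))  # ['T. Berners-Lee']
```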
OWL, first standardized by the W3C in 2004 and revised as OWL 2 in 2009, offers expressivity levels with different computational tractability tradeoffs: the original specification defined the OWL Lite, OWL DL, and OWL Full sublanguages, and OWL 2 added the EL, QL, and RL profiles tuned for specific reasoning workloads (W3C OWL 2 Overview). OWL is built on RDF (Resource Description Framework) and uses description logic as its formal underpinning, enabling automated reasoners such as HermiT and Pellet to detect inconsistencies and infer implicit class memberships.
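The kind of inference such reasoners perform can be sketched at toy scale; the subclass chain below is hypothetical, and a production reasoner handles far richer description-logic constructs than this transitive walk:

```python
# Toy inference in the spirit of an OWL reasoner: derive implicit class
# memberships from asserted subClassOf edges (class names are illustrative).
sub_class_of = {
    "TechnicalReport": "Document",
    "RFC": "TechnicalReport",
}

def superclasses(cls, sub_class_of):
    """Follow subClassOf edges transitively to collect inferred superclasses."""
    result = []
    while cls in sub_class_of:
        cls = sub_class_of[cls]
        result.append(cls)
    return result

# An instance asserted only as an RFC is inferred to also be a Document,
# even though that membership was never stated explicitly.
print(superclasses("RFC", sub_class_of))  # ['TechnicalReport', 'Document']
```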
The interaction between ontologies and metadata and information architecture is direct: metadata schemas define what properties are recorded about a resource, while ontologies define the formal semantics of those properties and how they interrelate.
Causal relationships or drivers
The adoption of ontological modeling in IA is driven by four identifiable system pressures:
- Query ambiguity — Natural language search fails when the same concept is named differently across content silos. An ontology's sameAs and equivalentClass relations allow a system to recognize that "myocardial infarction," "heart attack," and "MI" refer to the same entity, improving recall without relying on synonym lists alone.
- Cross-system integration — Enterprise environments routinely merge content from heterogeneous sources with incompatible schemas. Ontologies provide a shared semantic layer — a mediating schema — that maps divergent vocabularies to a common representation. This is the primary motivation behind the Linked Data initiative championed by the W3C (W3C Linked Data).
- Reasoning requirements — Applications that must derive new facts from existing data — such as clinical decision support systems or compliance monitoring tools — require formal logic rather than ad hoc keyword matching. Ontologies encode the rules that enable automated inference.
- Scalability of manual curation — As content collections grow beyond what human editors can curate exhaustively, ontological models allow systems to classify and relate new content programmatically against an established schema, reducing per-item editorial overhead.
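The query-ambiguity driver can be illustrated with a small equivalence-resolution sketch in Python, a union-find style grouping over hypothetical sameAs assertions:

```python
# Hypothetical sameAs assertions linking surface forms of one medical concept.
same_as = [
    ("myocardial infarction", "heart attack"),
    ("heart attack", "MI"),
]

def build_canonical_map(pairs):
    """Group equivalent terms so each maps to a single canonical label."""
    canon = {}

    def find(term):
        # Follow parent pointers until a term maps to itself.
        while canon.get(term, term) != term:
            term = canon[term]
        return term

    for a, b in pairs:
        canon.setdefault(a, a)
        canon.setdefault(b, b)
        canon[find(b)] = find(a)  # merge the two equivalence groups
    return {term: find(term) for term in canon}

canonical = build_canonical_map(same_as)
# A search for any synonym now resolves to the same entity as the others.
print(canonical["MI"] == canonical["myocardial infarction"])  # True
```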
The relationship between ontology and knowledge graphs reflects these same drivers: knowledge graphs are the deployed instantiation of an ontological schema populated with real-world instance data.
Classification boundaries
Ontologies are distinguished from adjacent IA artifacts by their formal properties:
- Ontology vs. taxonomy: A taxonomy encodes is-a (subsumption) hierarchies only. An ontology encodes is-a hierarchies plus arbitrary typed relations, property restrictions, and logical axioms. Every taxonomy is therefore a minimal ontology structure, but not every ontology is a taxonomy.
- Ontology vs. thesaurus: A thesaurus captures a small fixed set of relation types — hierarchical (broader/narrower, BT/NT), associative (related, RT), and equivalence — as defined by ISO 25964-1:2011. An ontology is not constrained to these types and can define arbitrarily many custom relation types with formal semantics.
- Ontology vs. database schema: A relational database schema defines tables, columns, and foreign key constraints but does not support open-world reasoning. An ontology operates under the Open World Assumption (OWA): the absence of a fact does not mean the fact is false, only unknown. Database schemas operate under the Closed World Assumption (CWA): what is not recorded is presumed false.
- Ontology vs. knowledge graph: A knowledge graph is the populated graph structure (schema + instance data). The ontology is the schema layer alone. This distinction is maintained in the W3C RDF/OWL technical stack but is frequently collapsed in product documentation.
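The OWA/CWA boundary can be made concrete with a small sketch: the same missing fact is reported as false under closed-world querying but merely unknown under open-world querying (the triples are illustrative):

```python
# A small fact base of (subject, predicate, object) triples.
facts = {("RFC 3986", "hasAuthor", "T. Berners-Lee")}

def holds_cwa(triple, facts):
    """Closed World Assumption: an absent fact is presumed false."""
    return triple in facts

def holds_owa(triple, facts):
    """Open World Assumption: an absent fact is unknown, not false."""
    return True if triple in facts else None  # None models "unknown"

query = ("RFC 3986", "hasReviewer", "Anonymous")
print(holds_cwa(query, facts))  # False
print(holds_owa(query, facts))  # None (unknown)
```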
Tradeoffs and tensions
Expressivity vs. computational complexity — The more expressive an OWL variant, the less tractable automated reasoning becomes: reasoning in OWL Full is undecidable, while OWL DL reasoning is decidable but exponential in the worst case. Practitioners selecting between OWL profiles must balance the need for rich semantics against the computational cost of inference at scale.
Consensus vs. specificity — Ontologies intended for interoperability require broad agreement among domain stakeholders, which tends to flatten distinctions that individual organizations need internally. The Gene Ontology (GO), maintained by the Gene Ontology Consortium, represents over two decades of negotiated compromise across its contributing research communities (Gene Ontology Consortium) — a process that is both the source of its broad adoption and its occasional coarseness for specific research contexts.
Maintenance overhead — Ontologies require sustained editorial governance. Domain knowledge evolves, terminology shifts, and new entity types emerge. Without a formal IA governance process, ontologies accumulate deprecated terms, broken axioms, and undocumented local extensions that degrade system reliability.
Closed vs. open world — Systems built on relational databases and those built on ontologies can conflict when integrated, because their underlying logical assumptions about missing data differ. This tension is especially acute in enterprise systems that overlay a semantic layer on legacy relational infrastructure.
Common misconceptions
Misconception 1: Ontology and taxonomy are synonyms.
Taxonomy is a strict hierarchy of is-a relations. Ontology subsumes taxonomy but additionally encodes typed, non-hierarchical relations and formal constraints. The distinction is not terminological preference — it is a functional difference in what logical operations a system can perform.
Misconception 2: An ontology must cover an entire domain.
Domain ontologies are intentionally scoped. A product ontology for a SaaS application may cover 30 classes and 80 relations while deliberately excluding concepts handled by adjacent systems. Scope boundaries are a design decision, not a failure of completeness.
Misconception 3: OWL is the only ontology language.
OWL is the W3C-recommended standard, but SKOS (Simple Knowledge Organization System), also a W3C standard, provides a lighter-weight vocabulary for thesauri and classification schemes (W3C SKOS Reference). SKOS does not support full OWL reasoning but is widely used for controlled vocabularies and folksonomies requiring basic semantic interoperability.
Misconception 4: Ontologies are primarily an AI concern, not an IA concern.
Ontology design decisions — what classes to define, how to name properties, which relations to encode — are fundamentally information architecture decisions. They determine findability, navigation paths, and search behavior in any system that instantiates them. The full scope of information architecture encompasses ontological modeling as a core structural concern, not a peripheral technical one.
Checklist or steps (non-advisory)
Ontology development process phases (as described in Noy & McGuinness, Ontology Development 101, published by Stanford University's Knowledge Systems Laboratory):
- Determine the domain and scope — define the competency questions the ontology must answer.
- Consider reuse — evaluate existing ontologies (e.g., Dublin Core, Schema.org, SNOMED CT) before building from scratch.
- Enumerate important terms — collect the vocabulary of the domain without yet assigning structure.
- Define classes and the class hierarchy — establish subsumption relations using the is-a test.
- Define class properties (slots) — assign data properties and object properties to each class.
- Define property facets — set domain, range, cardinality, and default values for each property.
- Create instances — populate at least a representative set of individuals to validate structure.
- Validate against competency questions — confirm the ontology supports the queries defined during scoping.
- Document provenance and versioning — record authorship, revision history, and deprecation policies.
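The validation phase can be sketched as running competency questions as query patterns over a populated model; the question and triples below are hypothetical examples, not part of the Ontology 101 methodology:

```python
# A populated model: (subject, predicate, object) triples (illustrative data).
triples = [
    ("RFC 3986", "hasTopic", "URIs"),
    ("RFC 3986", "hasAuthor", "T. Berners-Lee"),
]

# A competency question phrased as a query pattern:
# "Which documents cover a given topic?"
def documents_about(topic, triples):
    return [s for s, p, o in triples if p == "hasTopic" and o == topic]

# The ontology passes this competency check if the query is answerable
# from the classes, properties, and instances it defines.
print(documents_about("URIs", triples))  # ['RFC 3986']
```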
Reference table or matrix
Ontology artifact comparison matrix
| Artifact | Relation types supported | Formal logic | Inference support | W3C standard | Typical IA use |
|---|---|---|---|---|---|
| Flat list | None | No | No | — | Simple picklists |
| Taxonomy | is-a only | No | No | — | Navigation hierarchies |
| Thesaurus (SKOS) | BT/NT/RT + labels | Minimal | Limited | SKOS (W3C) | Controlled vocabularies |
| Ontology (OWL Lite) | is-a, limited properties | Description logic subset | Decidable, low cost | OWL 2 (W3C) | Lightweight domain models |
| Ontology (OWL DL) | Full typed relations + axioms | Description logic (SROIQ) | Decidable, high cost | OWL 2 (W3C) | Enterprise knowledge models |
| Ontology (OWL Full) | Unrestricted | Full first-order logic | Undecidable | OWL 2 (W3C) | Theoretical; rarely deployed |
| Knowledge graph | Instantiated ontology + data | Varies by schema | Varies | RDF (W3C) | Semantic search, linked data |
References
- W3C Semantic Web — Ontology — World Wide Web Consortium definition and standards context
- W3C OWL 2 Web Ontology Language Overview — W3C Recommendation, second edition
- W3C SKOS Simple Knowledge Organization System Reference — W3C Recommendation
- W3C Linked Data — W3C Semantic Web Activity
- Gene Ontology Consortium — Canonical example of a community-maintained domain ontology
- Noy, N.F. & McGuinness, D.L. — Ontology Development 101 — Stanford Knowledge Systems Laboratory, Stanford University
- ISO 25964-1:2011 — Thesauri and interoperability with other vocabularies (International Organization for Standardization)