Content Inventory & Taxonomy Complexity Estimator
Estimates the overall complexity score of a content inventory and taxonomy system based on content volume, hierarchy depth, cross-linking density, content types, and governance overhead. Use this to plan migration, audit, or redesign efforts.
Formulas
Component Scores (each 0–100):
- Volume Score = min(100, log₁₀(totalItems) / 6 × 100)
- Taxonomy Structure Score = min(100, depth × log₂(branchingFactor + 1) / (10 × log₂(33)) × 100)
  where branchingFactor = totalNodes / topCategories
- Content-Type Diversity Score = min(100, √contentTypes / √50 × 100)
- Cross-Link Density Score = min(100, ln(crossLinks + 1) / ln(51) × 100)
- Metadata Richness Score = min(100, metadataFields / 50 × 100)
- Governance Load Score = min(100, log₁₀(itemsPerOwner + 1) / log₁₀(1001) × 100)
- Localisation Score = min(100, (languages − 1) / 19 × 100)
- Churn Score = min(100, ln(updateFreq + 1) / ln(53) × 100)
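The component formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function and parameter names are assumptions chosen to mirror the formula variables, and the sketch assumes total_items is at least 1 and top_categories is greater than 0 so that the logarithm and the division are defined.

```python
import math


def cap(value: float) -> float:
    """Apply the min(100, ...) ceiling used by every component score."""
    return min(100.0, value)


def component_scores(total_items, depth, total_nodes, top_categories,
                     content_types, cross_links, metadata_fields,
                     items_per_owner, languages, update_freq):
    """Return the eight component scores as a dict (names are hypothetical).

    Assumes total_items >= 1 and top_categories > 0.
    """
    branching_factor = total_nodes / top_categories
    return {
        "volume": cap(math.log10(total_items) / 6 * 100),
        "tax_structure": cap(
            depth * math.log2(branching_factor + 1) / (10 * math.log2(33)) * 100
        ),
        "type_diversity": cap(math.sqrt(content_types) / math.sqrt(50) * 100),
        "cross_link": cap(math.log(cross_links + 1) / math.log(51) * 100),
        "metadata": cap(metadata_fields / 50 * 100),
        "governance": cap(math.log10(items_per_owner + 1) / math.log10(1001) * 100),
        "localisation": cap((languages - 1) / 19 * 100),
        "churn": cap(math.log(update_freq + 1) / math.log(53) * 100),
    }


# Hypothetical inventory, for illustration only.
scores = component_scores(
    total_items=25_000, depth=5, total_nodes=400, top_categories=12,
    content_types=14, cross_links=6, metadata_fields=22,
    items_per_owner=800, languages=4, update_freq=12,
)
for name, value in scores.items():
    print(f"{name}: {value:.1f}")
```

Inputs at the stated ceilings (1 M items, 50 content types, 50 links, 20 languages, and so on) each produce a score of 100, and anything above the ceiling is capped there.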
Composite Score = 0.20 × Volume + 0.20 × TaxStructure + 0.10 × TypeDiversity + 0.15 × CrossLink + 0.10 × Metadata + 0.10 × Governance + 0.10 × Localisation + 0.05 × Churn
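As a rough sketch, the weighted sum can be written as a lookup table of weights applied to a dict of component scores. The dict keys and the function name are assumptions made for this example; only the weights come from the formula above.

```python
# Weights from the Composite Score formula; the key names are hypothetical.
WEIGHTS = {
    "volume": 0.20,
    "tax_structure": 0.20,
    "type_diversity": 0.10,
    "cross_link": 0.15,
    "metadata": 0.10,
    "governance": 0.10,
    "localisation": 0.10,
    "churn": 0.05,
}


def composite_score(scores: dict) -> float:
    """Weighted sum of the eight component scores (each expected in 0-100)."""
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Illustrative input: every component at 50 gives a composite of about 50,
# because the weights sum to 1.
example = {name: 50.0 for name in WEIGHTS}
print(round(composite_score(example), 2))
```

Because the weights sum to 1, the composite stays on the same 0–100 scale as the components.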
Audit Hours = totalItems × 0.05 × (1 + compositeScore / 100)
Migration Person-Days = auditHours / 6 × (1 + compositeScore / 200)
Recommended FTEs = max(0.1, (totalItems / 500) × (1 + compositeScore / 100) / owners)
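The three effort formulas can be sketched as below. The function name and the sample figures are hypothetical, and the sketch assumes the expressions are evaluated left to right (so Migration Person-Days is (auditHours / 6) multiplied by the uplift) and that owners is greater than 0.

```python
def effort_estimates(total_items: int, composite: float, owners: int):
    """Return (audit_hours, migration_person_days, recommended_ftes).

    Assumes owners > 0 and composite on a 0-100 scale.
    """
    # 0.05 h/item baseline, scaled up by the composite score.
    audit_hours = total_items * 0.05 * (1 + composite / 100)
    # 6 productive hours per person-day, with a half-strength complexity uplift.
    migration_days = audit_hours / 6 * (1 + composite / 200)
    # ~500 items per FTE at baseline, divided by owners, floored at 0.1.
    ftes = max(0.1, (total_items / 500) * (1 + composite / 100) / owners)
    return audit_hours, migration_days, ftes


# Illustrative figures only: 10,000 items, composite score 50, 5 owners.
hours, days, ftes = effort_estimates(10_000, 50.0, 5)
print(f"audit hours ~ {hours:.1f}, person-days ~ {days:.2f}, FTEs ~ {ftes:.1f}")
```

With these inputs the arithmetic works out to roughly 750 audit hours (10,000 × 0.05 × 1.5), about 156 person-days (750 / 6 × 1.25), and 6 FTEs (20 × 1.5 / 5).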
Complexity Bands (lower bound inclusive): Low < 20 · Moderate 20–40 · High 40–60 · Very High 60–80 · Extreme ≥ 80
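A small sketch of the band lookup follows. The function name is hypothetical, and the handling of shared boundaries is an assumption: since Low is defined as < 20 and Extreme as ≥ 80, each boundary value (20, 40, 60, 80) is assigned to the higher band.

```python
def complexity_band(score: float) -> str:
    """Map a composite score (0-100) to its complexity band.

    Assumption: boundary values belong to the higher band,
    e.g. a score of exactly 40 is "High".
    """
    if score < 20:
        return "Low"
    if score < 40:
        return "Moderate"
    if score < 60:
        return "High"
    if score < 80:
        return "Very High"
    return "Extreme"


print(complexity_band(47.3))  # High
```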
Assumptions & References
- Volume uses a log₁₀ scale normalised by 6 (log₁₀ of 1 M) so that 1 M items scores 100; linear scaling would compress small inventories unfairly.
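- Taxonomy structure complexity is modelled as depth × log₂(branchingFactor + 1), an entropy-style count of the choices along a path, and reaches 100 at depth 10 with a branching factor of 32. The rationale is that deeper trees with higher branching factors are disproportionately harder to govern (Zeng & El-Gohary, 2014).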
- Content-type diversity uses a square-root scale to reflect diminishing marginal complexity beyond ~20 types.
- Cross-link density uses a natural-log scale; 50 avg links/item is treated as the practical ceiling for human-navigable faceted taxonomies.
- Metadata richness is linear up to 50 fields, consistent with Dublin Core extensions and enterprise MDM benchmarks.
- Governance load is measured as items per owner; ratios above 1 000 items/owner are considered unmanageable without automation.
- Localisation score is linear from 1 language (no extra complexity, score 0) to 20 languages (full complexity, score 100).
- Churn is log-scaled against a weekly update cadence (52/yr) as the practical upper bound for manual governance.
- Audit effort baseline of 0.05 h/item is derived from industry benchmarks for content audits (Halvorson & Rach, Content Strategy for the Web, 2012).
- Migration effort assumes 6 productive hours/person-day with a complexity uplift factor.
- FTE recommendation assumes one governance FTE can manage ~500 items/year at baseline; complexity scales this linearly, the total is divided by the number of owners, and the result is floored at 0.1.
- Weights were calibrated against practitioner surveys in the Information Architecture Institute's annual IA practice reports.