This article proposes that an Enterprise Conceptual Data Model derived from an authoritative Domain Ontology is not only an isomorphic submodel but also an optimal relational design.
We present empirical support that the transformation is a structure-preserving map from the Web Ontology Language to the Entity-Relationship Model with a one-to-one correspondence of the elements.
Taking the example of FIBO, the Financial Industry Business Ontology, we perform a quality assurance review of the derived Financial Industry Business Data Model, FIB-DM, starting with the generalization hierarchy rules in this part of the article, followed by entity and relationship rules.
There are two premises. First, FIBO is authoritative. Second, data modeling rules and best practices apply to ontologies.
The conclusion is that FIBO already reflects the data modeling rules and therefore FIB-DM is the optimal relational design for the Financial Industry.
The headline is an homage to Topbraid Maestro, my favorite ontology development platform. A Google search for “ontology data model transformation” finds Irene Polikoff’s 2011 post. Eight years ago, most readers and I considered ontology a branch of philosophy, not a computer language.
Note: This article and all diagrams apply to the Open Source core version of FIB-DM in full.
There is an isomorphism between a subset of the Web Ontology Language (OWL) and the Entity-Relationship Model (ERM). OWL has a higher expressivity and semantic richness than the ERM. In other words, we cannot express complex axioms in a data model. However, we can preserve the structure of classes, data properties, and object properties as entities, attributes, and associations in a Conceptual Data Model (CDM). The morphism is bidirectional. That means we can also preserve the structure of a CDM through ontology classes and properties.
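The structure-preserving map can be illustrated with a minimal Python sketch. It is not the CODT implementation. The `Ontology` and `ConceptualDataModel` types, the two functions, and the sample property names are assumptions made up for this illustration, and the ontology is reduced to the structural subset named above (classes, data properties, object properties).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ontology:
    """Tiny stand-in for the structural OWL subset (hypothetical type)."""
    classes: frozenset
    data_properties: frozenset      # (domain class, property name)
    object_properties: frozenset    # (domain class, property name, range class)


@dataclass(frozen=True)
class ConceptualDataModel:
    """Tiny stand-in for a CDM (hypothetical type)."""
    entities: frozenset
    attributes: frozenset           # (entity, attribute name)
    associations: frozenset         # (entity, association name, entity)


def to_cdm(onto):
    # Classes become entities, data properties become attributes,
    # object properties become associations. Names are carried over unchanged.
    return ConceptualDataModel(
        onto.classes, onto.data_properties, onto.object_properties
    )


def to_ontology(cdm):
    # The reverse direction undoes the renaming of the element kinds.
    return Ontology(cdm.entities, cdm.attributes, cdm.associations)


# Illustrative sample only; the property names are not taken from FIBO.
sample = Ontology(
    classes=frozenset({"Person", "Organization"}),
    data_properties=frozenset({("Person", "has date of birth")}),
    object_properties=frozenset({("Person", "is member of", "Organization")}),
)

cdm = to_cdm(sample)
roundtrip = to_ontology(cdm)
print(roundtrip == sample)
```

Because every element kind maps one-to-one and the names are untouched, transforming forward and back returns the structure we started with. Complex axioms are simply not part of this sketch, which mirrors the limitation described above.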
Configurable Ontology to Data-model Transformation (CODT)
The morphism is implemented in CODT, with configuration settings to generate a PowerDesigner conceptual data model.
My article and video explain the mapping, transformation rules, and structure of the ontology-derived data model in detail.
The empirical test examines the list reports of source and target:
- Are all FIBO classes in the FIB-DM list of entities?
- Are all data properties transformed into CDM data items?
- Are all object properties transformed into CDM associations?
- Structurally, are the FIB-DM attributes on the correct entity, as indicated by the FIBO data property domains and class restrictions?
- Do the relationships correctly reflect the object property domain, range, and restrictions?
- Finally, the topic of this part: do the generated inheritances (subtypes) represent the FIBO subclass relationships?
We confirm the structural equivalence comparing ontology and data model profile, the object counts in both Topbraid and PowerDesigner. Then we compare the list reports of source and target objects.
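The list report comparison boils down to a set comparison of names. The following Python sketch assumes that both tools can export their object lists as plain name lists. The function name and the four-name sample are illustrative, not actual report content.

```python
def compare_reports(source_names, target_names):
    """Return (missing in target, unexpected in target), both sorted."""
    source, target = set(source_names), set(target_names)
    return sorted(source - target), sorted(target - source)


# Hypothetical excerpts of a class list report and an entity list report.
fibo_classes = ["Autonomous Agent", "Person", "Legal Person", "Organization"]
fibdm_entities = ["Organization", "Person", "Legal Person", "Autonomous Agent"]

missing, unexpected = compare_reports(fibo_classes, fibdm_entities)
counts_match = len(set(fibo_classes)) == len(set(fibdm_entities))

print(missing, unexpected, counts_match)
```

Matching counts alone would not be sufficient, because one missing and one surplus object cancel each other out. That is why the sketch compares the names as well as the totals.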
Module and data equivalence
A test for a logically correct ontology module is whether queries executed on the module produce the same results as performed on the initial ontology. Database administrators use the same test to validate data subsets.
The Bank Regulation Ontology is an operational extension of the FIBO. The ontology adds classes and properties to hold FFIEC Call Report data. As presented at the 2017 FIBO conference, we load data into the extended FIBO and run SPARQL queries to reproduce parts of the regulatory report.
To prove that the Bank Ontology and the derived Bank Data Model are the same, we:
- Use CODT to transform the ontology into a CDM
- Generate a Physical Data Model (PDM)
- Create the schema on an RDBMS
- ETL our FFIEC sample data into the database.
The load is the first test of whether the derived schema supports the data requirements in full.
- Write SQL queries to extract Call Report data.
- Execute the SQL queries on the RDBMS and compare the results with the SPARQL queries on the RDF-store.
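The final comparison step can be sketched in Python. This is a toy under loud assumptions: an in-memory SQLite database stands in for the RDBMS, a plain list of triples stands in for the RDF store, and a set comprehension stands in for the SPARQL query. The table, the property name `hasTotalAssets`, and the figures are invented and are not FFIEC data.

```python
import sqlite3

# Hypothetical sample data: (bank name, total assets).
rows = [("Bank A", 100), ("Bank B", 250)]

# Relational side: create the schema, load the data, run the SQL query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bank (name TEXT PRIMARY KEY, total_assets INTEGER)")
conn.executemany("INSERT INTO bank VALUES (?, ?)", rows)
sql_result = set(
    conn.execute("SELECT name, total_assets FROM bank WHERE total_assets > 150")
)
conn.close()

# Graph side: the same facts as (subject, predicate, object) triples.
# The comprehension plays the role of the SPARQL query in this sketch.
triples = [(name, "hasTotalAssets", assets) for name, assets in rows]
graph_result = {
    (s, o) for s, p, o in triples if p == "hasTotalAssets" and o > 150
}

# Both implementations must return the same information.
same_answers = sql_result == graph_result
print(same_answers)
```

Comparing the results as sets ignores row order, which is appropriate here because neither query specifies a sort order.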
This is also the Proof of Concept for the Semantic Enterprise Architecture.
The diagram shows the ontology, a business conceptual enterprise model at the apex. We generate the conceptual, logical, and physical data model.
Both implementation RDBMS and RDF-store must be able to hold the same data. Both must return the same information.
Bijective transformation – ontology derived from the data model.
An isomorphism must be bijective on the underlying sets. That means FIB-DM transformed back to OWL must be equal to the FIBO.
The patent-pending CODT transformation uses metadata sets for the source ontology, interim generic ER, and the target model and PowerDesigner tool. The metadata sets are bi-directional by design. We can reverse-engineer a data model into an ontology.
The roundtrip transformation “FIBO” has all classes, object-, and data properties. It will only lack the complex class restrictions on values and property chains that are beyond the expressivity of the relational model.
We can compare the two ontologies to prove that they are indeed the same.
Recap – FIBO and FIB-DM are isomorphic
In summary, FIB-DM is a complete structural representation of the FIBO, and the morphisms between ontology and data model are bijective.
- We show empirically, by comparing list reports, that FIB-DM represents all the FIBO model objects.
- An ontology reverse-engineered from FIB-DM is a module of the FIBO.
- We instantiate both FIBO and FIB-DM and show that both databases load the complete business data and produce the same query results.
Quality assurance review of the data model
We have shown that FIB-DM is an isomorphic representation of the FIBO. That doesn’t prove that FIB-DM is suitable as an enterprise data model.
As the aforementioned Call Report example suggests, many poorly designed RDBMS instances still hold financial data and produce regulatory reports. A lousy database schema impacts performance and invites data anomalies and inconsistencies. The same holds for poorly designed ontologies on RDF-stores.
Therefore, to assure quality, we need an expert review of the Financial Industry Business Data Model.
Premise – FIBO is authoritative
The Enterprise Data Management Council (EDMC) the independent authority for best
(Enterprise Data Management Council: https://edmcouncil.org/)
The Financial Industry Business Ontology (FIBO) is a business conceptual model developed by our members of how all financial instruments, business entities and processes work in the financial industry.
The EDM Council is the Global Association created to elevate the practice of Data Management as a business and operational priority. The Council is the leading advocate for the development and implementation of Data Standards, Best Practices and comprehensive Training and Certification programs.
In my opinion, FIBO is superior to vendor models for these reasons:
- The EDMC employs distinguished semantic experts like Mike Bennett, the “father of the FIBO,” and Dean Allemang, the “working ontologist.”
- Leading international financial institutions provide domain experts participating in the FIBO content working groups.
- The FIBO has the most extensive documentation with concise definitions and annotations.
- Production releases undergo a rigorous review, including public comment, testing, and Object Management Group standardization.
Consequently, the quality of the FIBO design is assured, and I presume the source ontology is (almost) free of defects.
Premise – data modeling rules and best practices apply to ontologies.
The FIBO is a business conceptual model and so is an enterprise-level CDM. Conceptual models define concepts of interest to the business, their delineation, taxonomy and relationships to other concepts.
- The Logical Data Modeler specifies business concepts as entity-types, that derive into tables with data records.
- The ontologist specifies business concepts as classes. They are sets of instances (individuals, members) that match the class restrictions.
The proposition is that independent of the model notation, OWL vs. CDM, the business concepts are the same.
To disprove the proposition, the expert review team must identify design defects in FIB-DM.
Financial Industry Business Concepts
Part one of this article reviews the major FIB-DM entity hierarchies.
In the ontology, a fundamental class must be a class directly derived from The Thing. A Basic Concept in the data model must be an ultimate supertype. In other words, the entity must not be a child in an inheritance.
Forty FIB-DM entities derived from FIBO fundamental classes fulfill the condition. I arbitrarily designated sixteen of them, ultimate supertypes with a lot of subtypes and associations, as the Basic Concepts: Account, Agreement, Arrangement, Autonomous Agent, Commitment, Contractual Element, Currency, Document, Legal Construct, Location, Occurrence Kind, Product, Reference, Service, Thing In Role, and Time.
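The ultimate supertype condition is easy to check mechanically. The Python sketch below assumes the inheritances are available as (child, parent) pairs. The function name is made up for this illustration, and the sample is a small excerpt rather than the full model.

```python
def ultimate_supertypes(entities, inheritances):
    """Entities that never appear as the child of an inheritance."""
    children = {child for child, _parent in inheritances}
    return sorted(set(entities) - children)


# Illustrative excerpt: (child, parent) pairs of the Autonomous Agent hierarchy.
inheritances = [
    ("Person", "Autonomous Agent"),
    ("Legal Person", "Autonomous Agent"),
    ("Automated System", "Autonomous Agent"),
    ("Organization", "Autonomous Agent"),
]
entities = [
    "Autonomous Agent", "Person", "Legal Person",
    "Automated System", "Organization", "Currency",
]

basic_candidates = ultimate_supertypes(entities, inheritances)
print(basic_candidates)
```

The check only finds candidates. Which of the candidates are designated as Basic Concepts remains a modeling decision.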
We examine the Autonomous Agent hierarchy as an example. The diagram below is a Scalable Vector Graphic (SVG). You can right-click and open the image in a separate tab. There you can zoom in and out and scroll.
An agent is an autonomous individual that can adapt to and interact with its environment. (FIBO class definition)
The subtypes of Autonomous Agent are Person, Legal Person, Automated System, and Organization.
The Autonomous Agent hierarchy defines what an agent is – not what the agent does.
For example, the Person entity-type is for a single individual human being. The various roles and employment positions of that person are in a different FIB-DM concept hierarchy, the Agent in Role.
FIB-DM has a different conceptualization than the IBM Banking and Financial Markets Data Warehouse Model (BFMDW). The IBM model places a common supertype, the Involved Party, on top of Agent and Role. BFMDW famously has nine data concepts, whereas FIB-DM has sixteen.
Generalization Hierarchy Rules
The FIBO Autonomous Agent subclass hierarchy comprises more than eighty classes. With only two exceptions, these are primitive classes meant to be asserted by an application or load process.
The question is whether the ontology class hierarchy is optimal for a data model or not. In simple terms, would the expert review team raise defects:
- Missing entities: Based on business requirements and the FIBO documentation, the reviewer suggests adding a sibling or an intermediate level in the hierarchy.
- Superfluous entities: The reviewer flags entities as redundant, in other words, not supported by documentation.
- Changes to the inheritances.
While there are no Normal Forms for ontologies, data modeling rules at large also apply to ontology design:
Each subtype entity must possess one or more of the following characteristics: (1) at least one non-key attribute, or (2) a relationship with another entity that is logically only correct for it and none other.[Reingruber/Gregory]
Can we justify the two subtypes of the Adult entity in the diagram below?
As a conceptual domain ontology, the FIBO has only 150 data properties. Hence FIB-DM is an entity-level model with as many attributes. The reviewer considers whether attributes in support of the subtype are conceivable or, even better, supported in the documentation. The Incapacitated Adult should have a Date of Incapacitation, and the Legally Capable Adult a Date of Legal Capability. We can add a simple Date attribute or a relationship to the Date entity; FIBO and FIB-DM support both design patterns. Once we derive a Logical Data Model from FIB-DM, we will attribute it and revisit the justification for the subtypes. If there is no supporting attribute or relationship, then we remove the entity.
The very same considerations are under review for the ontology, and we can restate Reingruber/Gregory:
Each subclass must possess one or more of the following characteristics: (1) at least one data property, or (2) an object property that is logically only correct for it and none other.
An operational ontology derived from FIBO must justify primitive subclasses with properties. Otherwise, the load process or application would not know which class to instantiate.
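A reviewer could pre-screen a hierarchy for this rule with a few lines of Python. The sketch below assumes a simplified reading of the rule: a subtype counts as justified as soon as it owns at least one attribute or relationship. Whether a relationship is "logically only correct for it and none other" still needs human judgment. The function name and the data layout are assumptions for this illustration.

```python
def unjustified_subtypes(inheritances, attributes, relationships):
    """Return subtypes with neither an own attribute nor an own relationship."""
    subtypes = sorted({child for child, _parent in inheritances})
    return [
        name for name in subtypes
        if not attributes.get(name) and not relationships.get(name)
    ]


# The Adult example: only one subtype has been attributed so far.
inheritances = [
    ("Incapacitated Adult", "Adult"),
    ("Legally Capable Adult", "Adult"),
]
attributes = {"Incapacitated Adult": ["Date of Incapacitation"]}
relationships = {}

flagged = unjustified_subtypes(inheritances, attributes, relationships)
print(flagged)
```

The same check applies to an ontology if we read subtypes as subclasses, attributes as data properties, and relationships as object properties.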
A supertype entity or generalization hierarchy should be constructed under the following conditions: (1) a large number of entities appear to be of the same type, (2) attributes are repeated for multiple entities, or (3) the model is continually evolving.[Reingruber/Gregory]
The Autonomous Agent hierarchy is so detailed that reviewers are unlikely to report entities missing in the CDM. Conceivably, in the example below, we may find attributes repeated for the Privately Held Company and Joint Stock Company. Such attributes would justify a Non-Public Company entity.
Attribution may also indicate a change in the inheritance, for example, making the Joint Stock Company a specialization of the Privately Held Company.
Again, the same design considerations apply to the ontology class hierarchy:
A supertype class or generalization hierarchy should be constructed under the following conditions: (1) a large number of classes appear to be of the same type, (2) data properties are repeated for multiple classes, or (3) the ontology is continually evolving.
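Condition (2), repeated attributes, lends itself to a simple automated check. The Python sketch below assumes the attributes are available per entity. The function name and the attribute names are invented for illustration and are not taken from FIBO or FIB-DM.

```python
from collections import defaultdict


def repeated_attributes(attributes_by_entity):
    """Map each attribute name that occurs on several entities to its owners."""
    owners = defaultdict(set)
    for entity, names in attributes_by_entity.items():
        for name in names:
            owners[name].add(entity)
    return {
        name: sorted(entities)
        for name, entities in owners.items()
        if len(entities) > 1
    }


# Hypothetical attribution of two sibling entities.
attributes_by_entity = {
    "Privately Held Company": ["Number of Shareholders", "Founding Date"],
    "Joint Stock Company": ["Number of Shareholders"],
}

candidates = repeated_attributes(attributes_by_entity)
print(candidates)
```

An attribute that shows up on both siblings is a hint, not a verdict. The reviewer still decides whether it justifies a common supertype such as a Non-Public Company or a change in the inheritance.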
In other words, there are concepts that only apply to the CDM, but not to the ontology.
Every supertype must be associated with or contain a subtype discriminator.[Reingruber/Gregory]
In the Logical Data Model, the subtype discriminator is an attribute on the supertype, that indicates which subtype(s) applies. For example, a Stock Corporation Type indicates the category of Stock Corporation.
We do not require subtype discriminators in the Conceptual Data Model.
The less rigorous CDM opens an interesting question beyond the scope of this article: Is there an isomorphism between operational ontologies and Logical Data Models? I don’t believe discriminators, unless they are business codes, are needed in an ontology, because of the Open World Assumption and the restrictions on the subclasses.
A subtype entity can only be a member of the set of subtypes for one generalization relationship.[Reingruber/Gregory]
Multiple inheritance makes sense from a business perspective, and as a conceptual model, FIB-DM keeps multiple supertypes. However, the LDM permits only one supertype. Note that object models also allow multiple inheritances, and logical data modelers can use the same techniques to resolve them.
The FIB-DM Introduction class explains multiple inheritances in the CDM – a Central Bank is both a Bank and a Monetary Authority – and patterns to resolve them during the CDM to LDM transformation.
A valid multiple inheritance child must have characteristics of the supertypes – not merely reflect multiple taxonomies.
The Autonomous Agent hierarchy has two major nodes with multiple inheritances, Polity, and Legal Entity, as well as some hybrid business entities.
The Legal Entity is a legal person that is a partnership, corporation, or other organization having the capacity to negotiate contracts, assume financial obligations, and pay off debts, organized under the laws of some jurisdiction.
The diagram below shows Legal Entity to be both a Legal Person and a Formal Organization.
The supertype Legal Person is any entity which can incur legal obligation and can be sued at law. The Formal Organization is an organization that is recognized in some legal jurisdiction, with associated rights and responsibilities. Examples include a Corporation, Charity, Government or Church.
Both the Legal Person and Formal Organization meet the supertype justification. In other words, their subtypes share common attributes. All attributes apply to the Legal Entity, hence we have a valid poly-hierarchy.
The Polity is a legal person that is a supranational entity, crown, state, or subordinate civil authority, such as a province, prefecture, county, municipality, city, or district representing the people of that entity. The Polity breaks down into Municipality, Sovereign State, and Supranational Entity.
In the diagram below, the Polity is both a Legal Person and a Government Body.
The Government Body is a formal organization that is an agency, instrumentality, or another body of supranational, national, federal, state, or local government, including certain multijurisdictional agencies and departments that carry out the business of government.
Note that Legal Entity and Polity share the same supertype, Legal Person. However, the second supertype of Polity, the Government Body, is a specialization of the Formal Organization.
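The two poly-hierarchies can be written down as a small parent map to make the shared ancestry explicit. The Python sketch below contains only the inheritances described in this section. The dictionary layout and the `ancestors` helper are assumptions for this illustration.

```python
# Supertypes per entity, as described for Legal Entity and Polity.
parents = {
    "Legal Entity": ["Legal Person", "Formal Organization"],
    "Polity": ["Legal Person", "Government Body"],
    "Government Body": ["Formal Organization"],
}


def ancestors(entity):
    """Collect all direct and indirect supertypes of an entity."""
    found = set()
    stack = list(parents.get(entity, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, []))
    return found


# Entities with more than one direct supertype.
multiple_inheritance = sorted(
    name for name, supertypes in parents.items() if len(supertypes) > 1
)

# Supertypes that Legal Entity and Polity have in common.
shared = ancestors("Legal Entity") & ancestors("Polity")

print(multiple_inheritance, sorted(shared))
```

Walking the hierarchy upward shows that the Polity reaches the Formal Organization indirectly, through the Government Body, so both entities inherit from the Legal Person and the Formal Organization.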
Empirical evidence supports that the Configurable Ontology to Data-model Transformation preserves structural representation and bijection between ontology and conceptual data model. Hence, FIB-DM is an isomorphic subset of the FIBO.
We showed that data model generalization rules also apply to ontology class hierarchies. Accepting the premise that FIBO is authoritative implies that the FIBO hierarchy has already been reviewed against said generalization rules and has undergone extensive quality assurance. It follows that FIB-DM passes the generalization quality assurance review.
References and further reading.
The canonical version of this article: https://fib-dm.com/ontology-class-and-data-model-entity-hierarchy/
Entities & Packages list report for 2020/Q1 model: https://fib-dm.com/release-2020-q1-informative/
Upgrade to the Commercial Models with 1,968 normative and 4,563 informative entities: https://fib-dm.com/full-data-model-upgrade/
Financial Industry Business Ontology (EDM Council website): https://edmcouncil.org/page/aboutfiboreview
The Data Modeling Handbook: A Best-Practice Approach to Building Quality Data Models by Michael C. Reingruber, William W. Gregory
The next installments examine Relationship and Entity data modeling rules.