Semantics for Data Architects Video (part 1)

Industry-standard, Semantic Enterprise Architecture, Configurable Ontology to Data-model Transformation, Open Source

The Financial Industry Business Ontology (FIBO) is the authoritative model of financial industry concepts, their definitions, and relations. FIB-DM is the conceptual data model derived from the industry-standard ontology. The data model resources are educational diagrams, PowerPoints, videos, and supporting tools in MS-Excel and Visio. Semantics for Data Architects is the education module for Finance, Business, and other non-technical users. Part 1 formulates the chasm between semantic and conventional data management and presents FIB-DM as a bridge. We discuss use cases of leveraging FIBO for relational design and the limited support in standard tooling. The presentation provides a technical overview of the Configurable Ontology to Data-model Transformation (CODT). Part 1 closes with an overview of the open-source FIB-DM Core.

You can read the presentation or download the PowerPoint here.

Rough transcript

Hello, and welcome to Semantics for Data Architects, the Financial Industry Business Data Model, FIB-DM. This is an introduction to the ontology-derived data model for ontologists and data architects. While the underlying PowerPoint presentation is one deck, I split the videos into two parts. This first part covers the FIBO as the financial industry standard, the vision and goal of a Semantic Enterprise Architecture, and the ontology-derived open-source data model as a way to achieve that goal.

FIBO is the authoritative model of financial industry concepts, definitions, and relations. The Enterprise Data Management Council, the EDMC, is a global association of more than 20 financial institutions. The EDMC advocates data management best practices and the development and implementation of data standards. EDMC members developed the Financial Industry Business Ontology, FIBO, as a business conceptual model. More than 1,600 classes detail financial instruments, business entities, and processes.

You work at a financial institution and have already embraced model-driven development, industry standards, and reference models. As a Finance business stakeholder, you would start with the previous deck, Semantics for Managers. However, if you have a working knowledge of entity-relationship diagrams and ontology graphs, then this module is also for you. The main audience is Data or Application Architects experienced in Enterprise Reference Models. You may have used FIBO design patterns and definitions. Finally, as an ontologist with an in-depth understanding of the FIBO, you have already used the reference ontology for your design, and you want to spread the adoption of the FIBO across the enterprise.

About myself: I started working 20 years ago at the German Stock Exchange and then moved to America, where I worked at Reuters and Credit Suisse. Then, finally, I joined IBM and became an IBM Software Group consultant for the Banking and Financial Markets Data Warehouse Model, BFMDW. In that capacity, I was at 45 banks in North America, Europe, and Asia, implementing and customizing the industry model. Subsequently, I implemented BFMDW at Citi and Deutsche Bank. Deutsche Bank was where I turned from being a data architect to being an ontologist. Since then, I have presented at FIBO conferences. About my company: Jayzed Data Models is a US consulting company incorporated in 1999. Jayzed holds the copyright to the Financial Regulation Ontologies and the Semantics Compliance trademark.

There is a chasm between semantics and conventional data management. The EDMC specified the FIBO in the Web Ontology Language, OWL. This is the right decision: it uses the power of ontologies to specify a Business Conceptual Model. However, on the conventional side, OWL needs highly specialized ontologists. FIBO is comprehensive, with detailed coverage of business entities, loans, securities, derivatives, and indicators. However, many banks and investment managers don’t have the required expertise in-house. Some large financial institutions have already implemented the FIBO. They have the specialized databases, RDF or triple stores, and have deployed FIBO on these databases. They have ontologists in-house. However, even there, IT departments must still support and design conventional databases. FIB-DM is a bridge across that chasm, and it unites ontologists and data architects.

The ontology transformed into a data model leverages FIBO design for relational databases. On the left-hand side, we have the triple store; on the right-hand side, relational databases. FIBO, expressed in RDF/OWL, gets deployed on the RDF store. In the conventional world, we have data models that we physicalize: we extract the schema and instantiate the tables on our DBMS. This is where the Configurable Ontology to Data-model Transformation comes in. It enables us to convert an ontology into a Logical Data Model. An implementation of that is FIB-DM, an isomorphic twin of the FIBO.

What were the challenges that I had as a FIBO expert and ontologist trying to leverage the industry standard? One example, two years ago, was a New York bank. They needed a schema for a new Security Master system. As a FIBO advocate who knows the FIBO modules for Securities and Derivatives, I was trying to leverage the FIBO for the design of the logical data model. However, the challenge was that the data architects at the bank were, of course, not familiar with RDF/OWL. They had never used the ontology development platforms, Protégé or TopBraid. A workaround was that the ontologist, me, wrote SPARQL queries to extract metadata into MS-Excel spreadsheets so that the architecture modelers could review the design and definitions. Needless to say, that was very cumbersome. Then, last year, a Connecticut Alternative Investment Manager used the Hedge Fund Ontology to process the Securities and Exchange Commission’s “Form Private Funds.” They simply wanted to have the data also available on a relational platform. The challenge was to convert an operational ontology of some 200 FIBO and hedge-fund-specific classes. The workaround there was manual transcription of graphs into ERWin diagrams. Here again, that was a very tedious and error-prone process. It took me a week to re-enter the ontology graphs from TopBraid into the ERWin data modeling tool. So, I went looking for tooling support. However, neither Protégé nor TopBraid has an export into logical data models, and the most widely used data modeling tools, ERWin and PowerDesigner, do not have an import for RDF/OWL. Only two less widely used data modeling tools, Sparx Enterprise Architect and IBM InfoSphere Data Architect, can import RDF/XML. One update with a happy ending: in September, that New York bank downloaded FIB-DM Core. Nowadays, facing the same challenge, they can simply examine the FIBO as a data model.

This is a look at FIBO import into Sparx Enterprise Architect. Sparx does have some support to import RDF/OWL files. However, there are challenges. Number one, the import takes the whole Uniform Resource Identifier, the web address, as the class name, which is not suitable for logical data model naming conventions. Furthermore, data properties get reverse-engineered into pseudo-classes. In a data model, we need data properties as attributes. Third, the import takes class restrictions and converts them into anonymous classes, and these cannot be implemented or instantiated in a data model anyhow. They just become more boxes in an already cluttered diagram. Finally, if we look at the properties of a reverse-engineered class, we basically don’t see anything. The import does not convert the extensive FIBO documentation. We do not get tagged values for FIBO annotations. Sparx is a great tool, and the reverse-engineered OWL files may have some use for ontologists who want to visualize a graph. However, they are not useful for our purpose of having a FIBO-based relational data model.

What we want to see in the FIBO data model is the code made up of the FIBO prefix and the English name. We want to see the data properties as attributes. Then we want to be able to expand the model (let me pull this over), so we can add the parents and the children. We can add the related Person’s Address and the Citizenship. If we have the Citizenship and look at related elements, we can add the Country. So this is a logical data model. We see our entities and generalizations, and we see associative entities. In this case, it is the “has citizenship” association, which relates a person to the country of citizenship. Furthermore, when we now look at the properties of the Person, we see that it has the stereotype Ontology Class. We see the alias. This is the LDM English name, as per the normal naming convention: capitalized, with words separated by spaces. Then we see the CODT profile. It has all the FIBO definitions and annotations. Finally, recall that the ontology import turned the restrictions into anonymous classes. We do not discard this information. It is added as a tagged value for the benefit of downstream physical modelers and developers. Here we see the restrictions on the entity Person.

So the Financial Industry Business Data Model, FIB-DM, is the FIBO in PowerDesigner and other modeling tools. We have just seen FIB-DM in Sparx Enterprise Architect. Over a hundred users have downloaded the model since it launched. The FIBO is also in ERWin, E/R Studio, Visual Paradigm, and other data modeling tools. As a data modeler, you get a conceptual data model of 1875 entities with complete FIBO definitions, annotations, and axioms (the business rules). Data architects leverage the full content of the industry standard. Very importantly, we have a common language and design patterns for semantic and relational databases.

Looking at Semantic Enterprise Architecture, we can categorize artifacts by their use, their type, and the level where they apply. We have FIBO, the ontology. Out of FIBO, we implement RDF databases. With FIB-DM, we have a Business Conceptual Data Model applied at the enterprise level. Out of FIB-DM, we can generate the Physical Data Model and deploy the schema on an RDBMS. Furthermore, there is another twin: FIBUM, the UML model. That is the Sparx file that we just looked at. Besides relational structures, we can generate classes from it, for example for Java code. In the future, you should also derive messages and process models from a Common Conceptual Model, from an ontology, at the apex of Semantic Enterprise Architecture.

Semantic Model-Driven Development means that we start with conceptual models, be it the FIBO as a domain ontology or the FIB-DM conceptual model. Out of the conceptual model, we drive the generation of Logical Data Models for messages, processes, data, and objects. Out of the logical models, we can create physical models. Again, the ontology at the apex of Model-Driven Development ensures common names, definitions, and design patterns across the enterprise. For mid-sized financial institutions without semantic technologies yet, the advice is: adopt FIB-DM as a compatible strategic Enterprise Model. In other words, once you make the transition to employ semantic technologies, RDF stores, and so on, you are already familiar with the industry standard. Large institutions can use CODT to transform their in-house ontologies into data models for downstream implementation.

Transformation principles and considerations for a derived data model. We looked at what we don’t want in a derived model and what we want a data model to look like. The principles are, first of all, that the model must be practical and the model must be complete. We don’t want to miss any information from the ontology. The model must be fully documented and should have diagrams to depict Subject Areas and Design Patterns. Finally, the model must map back to its source, the ontology.

The Configurable Ontology to Data-model Transformation has a configuration file. In that file, we have the source ontology with connection parameters and a specification of the target model. We have object naming rules. In our case here, we generate the data model object code from the ontology prefix and the local name. The name is the un-camel-cased local name, as per the naming convention. We have triggers for what we want to do with anonymous and equivalent classes, and for how we want to transform object properties, data properties, and annotations.

From FIBO to FIB-DM to FIBUM: how does CODT work? The Configurable Ontology to Data-model Transformation is basic ETL. We extract metadata from the source ontology and transform ontology metadata into conceptual data model metadata. Then we load the metadata into the data modeling tool, in this case, PowerDesigner. The extract process runs SPARQL on the ontology to get the metadata. PowerDesigner has an import facility for Excel workbooks.
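To make the extract, transform, load idea concrete, here is a minimal sketch in Python. It is not the CODT implementation. The `CodtConfig` fields, the `ex` prefix, the `:` separator in the object code, and the CSV output (standing in for the Excel workbook) are all assumptions for illustration, and the extract step returns hard-coded rows where the real process would run SPARQL queries on the ontology.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class CodtConfig:
    """Hypothetical stand-in for the CODT configuration file."""
    source_ontology: str                 # location of the source ontology (illustrative)
    target_tool: str                     # target model, e.g. "PowerDesigner"
    skip_anonymous_classes: bool = True  # do not create entities for anonymous classes
    code_separator: str = ":"            # assumed separator between prefix and local name


def extract(config):
    """Extract: stand-in for the SPARQL queries that return class metadata."""
    return [
        {"prefix": "ex", "local_name": "DepositoryInstitution", "anonymous": False},
        {"prefix": "", "local_name": "_:b0", "anonymous": True},
    ]


def transform(rows, config):
    """Transform: ontology class metadata -> data model entity metadata."""
    entities = []
    for row in rows:
        if row["anonymous"] and config.skip_anonymous_classes:
            continue  # anonymous classes do not become entities
        code = f'{row["prefix"]}{config.code_separator}{row["local_name"]}'
        entities.append({"Code": code, "Name": row["local_name"]})
    return entities


def load(entities):
    """Load: serialize rows as CSV text, standing in for the Excel import file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["Code", "Name"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(entities)
    return buffer.getvalue()


def run_codt(config):
    """Run the three ETL steps in sequence."""
    return load(transform(extract(config), config))


if __name__ == "__main__":
    demo = CodtConfig(source_ontology="example.rdf", target_tool="PowerDesigner")
    print(run_codt(demo))
```

Keeping the three steps as separate functions mirrors the description above: only the extract step touches the ontology, and only the load step knows about the file format the modeling tool imports.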

The transformation is a two-step process using the patent-pending metadata sets. The extract populates ontology metadata sets for classes, object properties, data properties, and annotations. Step one transforms the ontology metadata and populates a generic E/R representation. The tool-specific metadata set is in PowerDesigner format. We serialize it in MS-Excel and load it directly into the tool. Step two is a simple conversion from E/R to PowerDesigner objects, properties, and extended attributes. The same applies if we have a different target model. For Sparx, we just replace the PowerDesigner tool-specific metadata set with the Sparx one in our transformation settings for Domain Ontologies. CODT is the patent-pending Configurable Ontology to Data-model Transformation. It enables the ontologist and data architect to control the transformation process and mapping.
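The two-step idea can be sketched as follows. This is only an illustration of the design, not the patent-pending metadata sets: the record fields and the per-tool column names (`Code`, `Name`, `Comment`, `Alias`, `Notes`) are hypothetical. The point is that step one is tool-neutral, so switching the target tool only swaps the step-two converter.

```python
def to_generic_er(ontology_classes):
    """Step one: ontology class metadata -> generic, tool-neutral E/R records."""
    return [
        {
            "kind": "entity",
            "code": cls["code"],
            "name": cls["name"],
            "definition": cls.get("definition", ""),
        }
        for cls in ontology_classes
    ]


def to_powerdesigner_rows(er_records):
    """Step two for PowerDesigner (column names are assumed, not the real format)."""
    return [
        {"Code": rec["code"], "Name": rec["name"], "Comment": rec["definition"]}
        for rec in er_records
    ]


def to_sparx_rows(er_records):
    """Step two for Sparx (column names are assumed); the English name goes to the alias."""
    return [
        {"Name": rec["code"], "Alias": rec["name"], "Notes": rec["definition"]}
        for rec in er_records
    ]


# Changing the target model means picking a different step-two converter.
TOOL_CONVERTERS = {
    "PowerDesigner": to_powerdesigner_rows,
    "Sparx": to_sparx_rows,
}


def two_step_transform(ontology_classes, target_tool):
    """Run step one, then the tool-specific step two."""
    er_records = to_generic_er(ontology_classes)
    return TOOL_CONVERTERS[target_tool](er_records)


if __name__ == "__main__":
    sample = [{"code": "ex:BankAccount", "name": "Bank Account",
               "definition": "An account at a bank."}]
    print(two_step_transform(sample, "PowerDesigner"))
    print(two_step_transform(sample, "Sparx"))
```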

How do I set this configuration for a high-level domain ontology in order to generate a conceptual data model? Anonymous classes do not transform into entities. Anonymous classes in an ontology are classes that have not been named and that are implied only by class restrictions. The setting here is not to create entities, not to create boxes in the diagrams. This is simply to remove the clutter of entities that never become physical. FIB-DM preserves anonymous classes used in class restrictions in the documentation, but we do not create pseudo-classes in the diagrams. Names transform from OWL camel case to LDM notation, in other words, English names with capitalization and spaces. For example, the FIBO “DepositoryInstitution,” camel case as we use it in ontologies, becomes “Depository Institution” in the EDM naming convention. Object properties transform to Associations in PowerDesigner and to Associative Entities. They are not simple relationships. This preserves the semantically important object property hierarchy and resolves Open-World properties. Domain, range, and class restrictions determine the parent and child entity related to the Association or Associative Entity. These two “NOTs” were basically the reason why I developed CODT. I had tried the out-of-the-box, one-size-fits-all transformations, and I found the model not usable as an EDM because it would be cluttered with anonymous entities, or it would not properly reflect the transformation of object properties.
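The naming rule, from camel case to capitalized English words separated by spaces, can be sketched with two regular expressions. The function name `uncamel` and the handling of acronym runs are assumptions for illustration; the actual CODT naming rules may treat edge cases differently.

```python
import re


def uncamel(local_name):
    """Turn a camel-case local name into capitalized, space-separated words."""
    # Put a space between a lowercase letter or digit and a following capital.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", local_name)
    # Split an acronym run from a following capitalized word ("FIBOClass" -> "FIBO Class").
    spaced = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", spaced)
    # Capitalize the first letter of each word without lowercasing the rest.
    return " ".join(word[0].upper() + word[1:] for word in spaced.split())


if __name__ == "__main__":
    print(uncamel("DepositoryInstitution"))  # Depository Institution
```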

This is how we get from FIBO to FIB-DM. On the left-hand side, we have an ontology graph. This is an example from the FIBO. It is about a bank account and what is related to it. The bank account is provided by some Bank, which is a subclass of the Depository Institution. Furthermore, it says this bank account is identified by exactly one bank account identifier. That’s the account number. This is what comes out of it: a Conceptual Data Model. Here, these are the mappings from the ontology graph to the conceptual model, showing what has been mapped and what is being generated. Classes like the Depository Institution become entities in the data model. The subclass relationship transforms to subtypes in the E/R model. Object properties like the ones here become associative entities. Object properties transform into Associations. Class restrictions, domain, and range determine the relationships, the Association links, and their cardinalities. When I first ran CODT on the FIBO, I found that a Domain Ontology generates a perfect Conceptual Data Model. Like this fragment here, the data model is the best representation of the bank account, its provider, and its ID. I’d say, as a data modeler, I wouldn’t design this any differently, and probably neither would any data modeler in this class. The same holds for the rest of the FIB-DM model. The two are really equivalent. Furthermore, there are no missing and no superfluous entities and relationships in the design. The two things are the same, and it’s not really surprising because we are analyzing the same business case and the same business rules. Classes in the ontology perfectly transform to entities in the data model; object properties perfectly transform into associations and associative entities.
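The bank account fragment can be written down as a small mapping sketch. The input structures, the property spellings `isProvidedBy` and `isIdentifiedBy`, and the default cardinality of `0..*` for properties without a restriction are assumptions for illustration; they are not taken from the FIBO files or from CODT.

```python
# Ontology fragment from the example, as plain Python data (spellings are illustrative).
CLASSES = ["BankAccount", "Bank", "DepositoryInstitution", "BankAccountIdentifier"]
SUBCLASS_OF = [("Bank", "DepositoryInstitution")]
OBJECT_PROPERTIES = [
    {"name": "isProvidedBy", "domain": "BankAccount", "range": "Bank"},
    # "exactly one" restriction -> minimum 1, maximum 1
    {"name": "isIdentifiedBy", "domain": "BankAccount",
     "range": "BankAccountIdentifier", "cardinality": (1, 1)},
]


def to_conceptual_model(classes, subclass_of, object_properties):
    """Map classes, subclass links, and object properties to E/R constructs."""
    model = {"entities": [], "subtypes": [], "associations": []}
    for name in classes:
        model["entities"].append(name)  # class -> entity
    for child, parent in subclass_of:
        model["subtypes"].append({"supertype": parent, "subtype": child})
    for prop in object_properties:
        # Assumed default when no restriction is given: zero to many.
        low, high = prop.get("cardinality", (0, None))
        model["associations"].append({
            "name": prop["name"],
            "from": prop["domain"],   # domain -> one end of the association
            "to": prop["range"],      # range -> the other end
            "cardinality": f"{low}..{'*' if high is None else high}",
        })
    return model


if __name__ == "__main__":
    result = to_conceptual_model(CLASSES, SUBCLASS_OF, OBJECT_PROPERTIES)
    for association in result["associations"]:
        print(association)
```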

FIB-DM Core is open source. It is released under the GNU General Public License 3.0, a license recommended by the Open Source Initiative. You can download FIB-DM on the website. FIB-DM is a thousand-entity open-source model.

Breaking down these numbers, the pie chart shows the number of classes by FIBO module. What we see here: in the Commercial License, we have Derivatives, Indicators, and Securities. The open-source part, FIB-DM Core, consists of one thousand twenty-nine classes. If I break down what is in the open-source model, we have Foundation with three hundred eighty classes, Finance, Business and Commerce with three hundred eighteen, and finally Business Entities with two hundred sixty-nine, plus some smaller modules for reference codes, languages, countries, and currencies. Then there is Semantic Metadata, which is basically for doing annotations.

FIB-DM Core is a self-contained, standalone data model. What we see here are the main FIBO modules. At the center, we have Foundation, and we have some Generic modules that Foundation imports. We have Business Entities and FBC. These ontology modules, which in the data model are called packages, import Foundation content. In turn, Securities imports Business Entities and FBC. Finally, Indicators and Indices, and Derivatives, import the Securities module. When I classify these modules, the top layer is Upper Ontology constructs; these are Generic Data Model constructs. Then we have the Domain Core: Foundation, Business Entities, and Finance, Business & Commerce. Finally, we have Extensions, more specialized areas of the Financial Model. Everything above the line is free and open source, and the extensions are available for Commercial Licensing. More packages are soon to come. The FIBO is still growing, and there are future modules that are still in FIBO development. Once they are released for FIBO Production, I incorporate them and transform them into Data Model Packages. What’s coming up is Loans, that is, loans and mortgages, Market Data, Corporate Actions & Events, and Collective Investment Vehicles, also known as Investment Funds. Thanks for watching. This is a good opportunity to take a break. Then please continue watching part 2, FIB-DM structure. That is a demo of the Financial Industry Business Data Model in the data modeling tool, PowerDesigner.