The Configurable Ontology to Data model Transformation, CODT Tutorial.
The Configurable Ontology to Data Model Transformation (CODT) is the technology that created the 4,568-entity FIBO Data Model. The US Utility Patent filed two weeks ago opens CODT for Financial Institutions that have already customized the industry-standard ontology and need a data model reflecting their extensions (most of them are global banks).
The CODT tutorial addresses challenges for Semantic Centers of Excellence leveraging ontologies for Data Management. Semantic and conventional data management should use the same language instead of creating yet another silo. The vision is Semantic Enterprise Information Architecture with ontologies at the apex and derived data, message, process, and object models.
The first part provides an overview of the ontology-derived model, a high-quality conceptual data model that resolves issues with old-style parsing transformation approaches. In detail, we examine the features of the data model, sophisticated mappings, lineage, and ontology annotation properties. We look at the Normative and Informative FIB-DM models, the fifteen business concepts, and the Open Source vs. commercial license.
The second part of the tutorial is a deep dive into the ETL (Extract, Transform, and Load) inspired approach. Patent-pending Metadata Sets for the Ontology, the generic Entity-Relationship model, and the specific Data Modeling Tool are self-populating. We extract Ontology metadata using SPARQL queries and use MS Power Query, Excel formulas, and Visual Basic for Applications code for the transformations. CODT also operates in reverse mode, transforming Data Models into RDF/OWL. The second part closes with an overview of the license, pricing, and the Proof of Concept.
At the end of the video, click on the “Atlantic CODT meets MS-PowerQuery” image to watch the second part.
After watching the whole video, download the PowerPoint for your reference.
Part 1 – Introduction, SEIA, FIB-DM
Hello, and welcome back to FIB-DM. Today I’m speaking about Semantics for eXtra Large Banks, the Configurable Ontology to Data model Transformation, CODT.
It’s an overview of Semantic Enterprise Information Architecture and Model-driven Development with the ontology at the apex of the hierarchy.
One word here, this XLB, eXtra Large Bank – it’s a placeholder. You are welcome to modify the slides and replace it with your bank’s logo and name.
You already embrace RDF/OWL and the FIBO. You have a semantic Technology Center of Excellence, Triple-stores in production, you use and support the development of the industry-standard ontology, and you downloaded and evaluated the FIBO Data Model.
The CODT patent filing enables full disclosure of the transformation technology.
What are the challenges for Semantic Centers of Excellence? You have already implemented, extended, and customized the industry-standard ontology; however, 95% of your bank still runs on relational databases using data models.
XLB has highly qualified ontologists and data scientists. Data architects have the FIBO data model, but they cannot leverage the work of their semantic COE colleagues. The risk is that semantic implementations become yet another data silo, using a different language than the rest of the organization, and with that, impeding the data integration.
The vision is Semantic Enterprise Information Architecture, SEIA. When we look at the architecture, we can look at the use, the type, and the level where it applies.
Traditionally, we have conceptual, logical, and physical data models. They deploy to relational database management systems. What’s new is that we have the FIBO, with our extensions, and that gets instantiated on RDF-Stores. Now, with CODT, we can transform our ontologies into data models and already into object models, and in the future, also into message and process models. That is SEIA: the ontology at the apex and derived models for data, object, message, and process.
The way to get there is Semantic Model-Driven Development, which has been around for some time. We convert Logical into Physical data models, and we can convert between data and object models. So, we have the RDF/OWL, the ontology on the triple store. We have data models – out of it, we generate database schemata, and now with the Configurable Ontology to Data model Transformation, we can transform ontologies into data models, and reverse-engineer data models into ontologies. That is FIB-DM.
One word about the title: the deck is called Semantics for eXtra Large Banks. Asset size is a poor proxy for semantic sophistication. Semantics for Data Architects was the name of the first FIB-DM education resource and became a catchphrase. After that followed Semantics for Mid-sized Banks, for Large Banks, and so on. However, some Financial Institutions like Hedge Funds are very advanced, and many mid-sized banks on FIB-DM are now building out ontology capabilities. They are just starting to use the FIBO ontology besides the data model. So, CODT is for Financial Institutions that are using and extending the FIBO; many, but not all of them, are “extra-large banks.”
The intended audience for this webinar is the same as the designated Proof of Concept team. The Finance Management side would have someone who has a working knowledge of E/R and ontology diagrams. You have to be authorized to sign non-disclosure and license agreements. The Ontologist has an in-depth understanding of the FIBO and your in-house ontologies: You are the one who wants to spread the adoption across your enterprise. You are well versed in RDF/OWL and SPARQL. The Data Architect has experience in enterprise reference models; you evaluate, and you want, the industry-standard FIB-DM. You are an expert in your data modeling tool and its import functionality. Finally, we need developers, MS-Excel power users with experience in Visual Basic for Applications, Power Query, and the M language.
About myself: I have 20 years of industry experience as a data architect and ontologist at leading Financial Institutions and service providers. In particular, for seven years at the IBM Software Group, I was a Banking and Financial Markets Data Warehouse (BFMDW) consultant. I had to deploy and customize the IBM data model at 45 banks in North America, Europe, and Asia. Then, for four more years, I directly consulted Citibank and Deutsche Bank on BFMDW implementations. At Deutsche, someone pointed me to the FIBO seven years ago, and since then I have been a contributor, reviewer, and speaker at FIBO conferences. My company, Jayzed Data Models, is a U.S. consulting company, incorporated in 1999. Jayzed holds the FIB-DM copyrights and is the designated assignee of the CODT patent.
Here are the origins of CODT and FIB-DM: A few years ago, a New York bank needed a new schema for a Security Master System. Of course, as a FIBO advocate, I tried to leverage the industry-standard ontology for the Logical Data Model. The challenge was that data architects are not familiar with RDF/OWL; they have never used Protégé or TopBraid. The workaround was for the ontologist, for me, to write SPARQL queries to extract metadata into MS-Excel spreadsheets, so that the architects could review the FIBO design and definitions. A Connecticut Alternative Investment Manager had the Hedge Fund Ontology, and the task was converting my 200 FIBO and hedge-fund-specific ontology classes into a Logical Data Model so they could put the data on an RDBMS, in addition to the Triple-store. Here the workaround was a manual transcription, in other words, typing graphs into ERWin, plus some metadata extract and import. Both cases were tedious, manual work. I tried out existing tooling, but it chokes on very large ontologies, and it does not derive a useful data model. In other words, ontologists and data architects have to copy and paste manually. That’s why I developed a better transformation and the FIBO data model.
“Atlantic,” the latest version, is a way to Enterprise Information Architecture and Model-Driven Development with the ontology. The 2020/Q2 release has 4,500 entities. It is the World’s Largest Data Model. We have the Configurable Ontology to Data Model Transformation, CODT.
The FIBO is more than a Knowledge Graph. The Enterprise Data Management Council, on their website, says, “the Financial Industry Business Ontology, FIBO, is a business conceptual model developed by our members of how all financial instruments, business entities and processes work in the financial industry.” The council and its members correctly decided to define their business conceptual model in the Web Ontology Language, in OWL, because it has superior semantics in the notation. FIBO conceptualization and relations are fully applicable to lower-semantic taxonomies, concept maps, object models, and data models. FIB-DM is a perfect conceptual model. When you have the deck, you can follow the links here to read more about these statements in depth.
The FIBO is superior to vendor data models. Almost 600 years ago, Robert II of d’Uzes proclaimed Charles VII king of France. Yet the Involved Party is still an ultimate supertype in numerous reference models and databases. The FIBO breaks up that commingled Entity into two Fundamental Concepts: the Autonomous Agent (person, legal entity) and the Thing in Role (customer, employee, broker, and so on). There is conceptually no common supertype above Autonomous Agent and Thing in Role – period.
Instead of eight data concepts, the FIBO has 15 Fundamental Business Concepts. They are all ultimate supertypes in the FIBO Data Model, and 80% of FIB-DM entities are subtypes of these 15 concepts.
The EDMC support and 800 data model downloads: At the beginning of the year, the EDMC offered to promote the FIBO Data Model on their website. It says, “many mid-sized EDMC members want to leverage the industry-standard but don’t have ontology tooling, databases, and human expertise in-house yet.” With FIB-DM, Data Architects no longer manually transcribe ontology graphs or copy and paste definitions. 800 users downloaded the open-source version of the FIBO Data Model. However, at larger EDMC members, even with FIB-DM, architects still have to copy and paste their FIBO customizations and extensions manually.
Here is what CODT provides: An ontology-derived data model. On the left-hand side, we have an ontology graph. It has a bank which is a Depository Institution, the bank provides a bank account, and the bank account has a bank account identifier. On the right-hand side, we have the conceptual data model with the same entities, Depository Institution, Bank, Provides, Bank Account, “Identifies,” Bank Account Identifier. CODT transforms classes into data model entities. The subclass becomes an inheritance, also known as a generalization or subtype. Object properties transform into associative entities, and finally, class restrictions, domain/range determine relationships and cardinalities.
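The mapping rules just described can be sketched in a few lines of Python. This is only an illustration of the rules, not the CODT implementation: the `ERModel` structure and the `transform` function are hypothetical names, and the direction of the “Identifies” association is an assumption read off the diagram description.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ERModel:
    """Hypothetical container for the derived conceptual data model."""
    entities: List[str] = field(default_factory=list)
    subtypes: List[Tuple[str, str]] = field(default_factory=list)  # (subtype, supertype)
    associative_entities: List[Tuple[str, str, str]] = field(default_factory=list)  # (name, from, to)


def transform(classes, subclass_of, object_properties) -> ERModel:
    model = ERModel()
    # Rule 1: every ontology class becomes an entity.
    model.entities.extend(classes)
    # Rule 2: every subclass axiom becomes an inheritance (subtype) link.
    model.subtypes.extend(subclass_of)
    # Rule 3: every object property becomes an associative entity;
    # domain and range determine which entities it connects.
    for name, domain, range_ in object_properties:
        model.associative_entities.append((name, domain, range_))
    return model


model = transform(
    classes=["Depository Institution", "Bank", "Bank Account", "Bank Account Identifier"],
    subclass_of=[("Bank", "Depository Institution")],
    object_properties=[
        ("Provides", "Bank", "Bank Account"),
        ("Identifies", "Bank Account Identifier", "Bank Account"),  # direction assumed
    ],
)
print(model.associative_entities)
```

Cardinalities from class restrictions are left out of the sketch; they would add one more attribute per associative entity.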
The current tooling imports are not fit for the purpose. Data modeling tools like Sparx E.A. and IBM IDA have a rudimentary import for RDF/OWL files, but the imports are a one-click black box. There are no options to configure the import and no diagnostics. The problems are that URIs transform into entity names, datatype properties become classes on their own, and class restrictions become anonymous pseudo-classes. There’s no import of the FIBO annotation properties. The underlying problem is that the parsing approach is not scalable. Traditional transformation parses ontology files. The tools open the RDF/OWL file, encounter elements of the ontology, and create elements of the data model as they go through the source file. The parsing approach reaches its limits with very large ontologies like the FIBO. By default, ontology object properties transform into data model relationships. This transformation loses metadata for object properties with particular design patterns. XLB and other large Financial Institutions developed their own rudimentary transformations. Everyone reinvents the wheel on their own. I invite you to compare FIB-DM to a vendor model or an in-house transformation of the FIBO to see the difference. License the transformation that created the industry-standard model rather than doing it yourself.
Here’s the outcome of the transformation: Let’s start with package properties derived from ontology modules. The package Name is the rightmost string in the ontology namespace. The Code transforms from the ontology prefix. It serves as a unique code for the package, and all ontology classes and properties with that prefix, for example, fibo-fnd-agr-agr, become model objects in the Agreements package. The URI, the Uniform Resource Identifier of the ontology, is the traceability link to the source of the model object. The second part of this overview shows how CODT extracts properties, transforms, and adds them to the model.
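As a small illustration of the package rule, here is a Python sketch that derives the package Name, Code, and URI from a namespace and prefix. The function name `package_from_namespace` and the property keys are hypothetical, and the namespace string is used only as an example input.

```python
def package_from_namespace(namespace: str, prefix: str) -> dict:
    """Derive package properties: Name is the rightmost namespace segment,
    Code is the ontology prefix, URI is the namespace itself."""
    # Drop a trailing "/" or "#" before taking the last path segment.
    name = namespace.rstrip("/#").rsplit("/", 1)[-1]
    return {"Name": name, "Code": prefix, "URI": namespace}


pkg = package_from_namespace(
    "https://spec.edmcouncil.org/fibo/ontology/FND/Agreements/Agreements/",
    "fibo-fnd-agr-agr",
)
print(pkg["Name"], pkg["Code"])
```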
Packages have extended attributes, and these are all FIBO annotation properties. They are carried over, transformed, and preserved in the data model.
Looking at entities, the Entity Name is the ontology Class Localname, converted from the Camel Case notation to the LDM naming standard, basically capitalized with a space between the words. The Code transforms from the ontology class. It’s defined as the Prefix, a colon, and the Localname. The Comment populates from the class annotations, RDFS comment, and SKOS definition. There are two particular tabs here, annotations and lineage. Entity annotations are the extensive FIBO documentation. Here’s the chart of the usage of standard annotation attributes. Very important: the Entity Lineage consists of ontology properties that we carry over. It helps us trace back to the original, in this case, the class in the FIBO or another ontology. We preserve the Resource Name, the Localname, the prefix, the URI, the restrictions, and the equivalent class axioms. The restriction is a semantic beyond the capabilities of the entity-relationship model. We cannot model it in the data model, but we preserve it as a documentation item.
Another thing is intricate FIBO patterns, or ontology patterns generally, for example, sub-properties. Here’s a graph of amounts in the FIBO. When we transform it, the simple proposition that an object property transforms into a relationship goes wrong. On the left-hand side, we see a fund portfolio, certificate of deposit, and loan; they are all shown as simple relationships. The correct transformation turns them into Associative Entities. We preserve the object property hierarchy. The FIBO defines that the Principal amount is a subtype of the Notional amount, which is, in turn, a Monetary amount. We want to preserve the hierarchy of our associative entities in the data model. To look at this more in-depth and to see more cases of the wrong transformation, check out the article “why ontology object properties should transform into data model associative entities.”
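To make the point about preserving the object property hierarchy concrete, here is a minimal Python sketch. The property names are illustrative placeholders rather than actual FIBO identifiers, the helper names are hypothetical, and the sketch assumes the sub-property hierarchy has no cycles.

```python
# Illustrative sub-property axioms: child property -> parent property.
sub_property_of = {
    "has principal amount": "has notional amount",
    "has notional amount": "has monetary amount",
}
properties = ["has monetary amount", "has notional amount", "has principal amount"]


def to_associative_entities(properties, sub_property_of):
    """Each object property becomes an associative entity; a sub-property
    axiom becomes a supertype link between associative entities."""
    return {p: {"name": p, "supertype": sub_property_of.get(p)} for p in properties}


def supertype_chain(entities, name):
    """Walk up the preserved hierarchy from one associative entity."""
    chain = []
    current = entities[name]["supertype"]
    while current is not None:
        chain.append(current)
        current = entities[current]["supertype"]
    return chain


entities = to_associative_entities(properties, sub_property_of)
print(supertype_chain(entities, "has principal amount"))
```

A plain relationship has no place to record the `supertype` link, which is why the hierarchy is lost when object properties transform into simple relationships.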
The FIB-DM data model is two data model files, the Normative and the Informative. The Normative model has the basic FIBO modules, Foundation, Business Entities, Finance Business and Commerce, and from the upper ontologies, Language, Countries & Currencies. This is in your Open Source core model, the FIBO 2018/Q4. FIBO/FIB-DM 2020 Q1 Production added Derivatives, Indicators, and Securities. Finally, in FIBO/FIB-DM 2020 Development, there are further Informative model packages for Loans, Business Processes, Collective Investment Vehicles, Market Data, and Corporate Actions and Events. New FIBO modules enter FIBO Development; they become part of the Informative FIB-DM model. After a while, once they’re all vetted, Informative modules become Normative. That is the development cycle, and the recommendation is that you do the same with your FIBO extensions. Let’s say you do a module for Credit Cards. You would start with it in your Development ontology version, and you would generate it into the Informative model. When you have finalized your Card module, it moves over to your Normative Enterprise Ontology and the Normative Data Model.
What we do is: We open both models, the Normative and the Informative, and add them to our PowerDesigner workspace. With other modeling tools, we import both data models that ship with the commercial version. We expand the Normative model and see 14 base packages and the central model diagram. It is a package diagram, and it shows the major dependencies, the references from one package to another. At the top, we have the Upper Ontology content, the ISO codes for language, currencies, and countries, and SKOS and Specification Metadata for annotations. Then comes the FIBO core with the Foundation, FBC, and BE modules, and then the Full Normative content in green: Securities, Derivatives, Indicators. Of course, the full normative content depends on the FIBO core. Finally, in the FIBO Development Version are the Informative packages, Mortgages & Loans, CIV, Market Data, Corporate Actions, and Business Process.
How does it all fit together, FIBO, Vendor models, In-house models for semantic information architecture? Our goal is to leverage. We want to create databases, Java code, UML, and data models. Our method is to derive. We adhere to the industry-standard with the Normative models for FIBO Production. For the Enterprise Data Model, we consult FIBO Development, other Industry Standards, our in-house models, and vendor models. Out of these, we move informative content over to the normative side. We generate implementation models from both Normative and Informative.
For data architects, Robert d’Uzès’s advice is to merge your Vendor models and your In-house models into FIB-DM. The vendor models have excellent value. Keep it and harvest the content! On the left-hand side, we see the FIB-DM entity hierarchy for Agreements. First, we want to adhere to the industry standards, the 15 Concepts, and the subtype hierarchies. We adopt the FIBO/FIB-DM names and definitions. Second, looking at our In-house model and at our Vendor model, we identify direct entity matches and synonyms. For example, in FIBO, it’s called an Agreement, and that is just a synonym for what the IBM model calls an Arrangement; hence they are direct matches. Security is also in other models; it’s in your in-house models. Just beware of homonyms, where you have the same name, but it means something different in the FIBO versus the other model. Third, we merge entities that are not already in FIB-DM into the model. We identify the appropriate supertype. If you have an entity that is not already in FIB-DM, you figure out where in the hierarchy it would fit.
Finally, number four, we merge attributes from the vendor/in-house model and attribute the FIB-DM enterprise model.
Note another thing; the FIBO Data Model correctly defines Financial Instruments as a subtype of the Contract. That is an Agreement – it’s not a Product as seen in some vendor models.
Concept maps, the FIB-CM, directly link to the data model. On the left-hand side, we see a concept map. From the 15 Concepts, we see a Depository Institution, which has an Identity as a Stock Corporation. It has a Legal Entity Identifier. It has an address, and so on. For U.S. institutions, it also has an FDIC certificate number. A registration authority registers it, and it is within the registry. We have a direct correspondence from the concept map to the data model. To learn more about scoping a data model and the 15 Concepts, see the link to “Semantics for Finance Users.”
This table compares the General Public 3.0 License to the Commercial License. It goes by topic, details, the current license GPL 3.0, and the Jayzed customer license: Release, Domain, Distribution, number of entities, and Normative and Informative packages and the resources. In a nutshell, the open-source license requires you to copyleft, which is to license your derived models to the public. With a commercial license, you keep FIB-DM extensions private. However, you are not allowed to publish your derived models if they contain the content from the Full Data Model. Likewise, all educational materials are subject to copyright. You are welcome to download them from the website and distribute them, but you’re not allowed to change them. With a commercial license, you’re free to modify, translate, edit, and you can even lift off diagrams and images, as long as they remain within your organization.
In summary: The Financial Industry Business Data Model is the most comprehensive enterprise reference model with 1968 Normative and 4563 Informative entities. It has the superior design of a semantic data model. It has extensive documentation of the industry-standard ontologies. It has a full lineage from the data model into the ontologies, including links. Semantic Enterprise Information Architecture has the same names, definitions, and design patterns across the enterprise. The ontology is at the apex, with business-friendly Concept Maps and derived data and object models. It unifies semantic and conventional data management.
You have full transparency for your FIB-DM evaluation. Explore the PowerDesigner model. Download the FIB-DM core Open-Source version. Study the extensive education resources. Examine the full 2020/Q2 content; it has listings of all 4,500 entities and the definitions. Review the license, maintenance agreements, and pricing.
Part Two – Transformation
Version 1.0, “Atlantic,” meets Microsoft Power Query. In this stack, we show how the actual transformation works. You can see it in MS-Excel using Power Query and the “M” language.
The patent-pending technology that created the FIBO Data Model: As we have seen in the first part of the webinar video, the old OWL-file parsing approach doesn’t produce usable data models, and it cannot cope with very large ontologies. ETL inspired the new approach. It creates high-quality models, and the technology is fully scalable and configurable. With ETL, we start with RDF/OWL, the ontology:
We Extract, Transform, and Load into the data modeling tool. Behind the scenes are Metadata Sets, MDS. They are keyed records holding properties for all objects of the model. For example, we have a Metadata Set for entities with 4,568 records. There are three types of metadata sets: The Ontology metadata sets hold the records extracted from the ontology platform. Entity-Relationship metadata sets transform the ontology metadata into entity-relationship metadata. PowerDesigner (or any other tool) metadata sets are ready to load into the modeling/development platform.
The metadata sets are the novel approach to model transformation. Let me define it: A Metadata Set is metadata stored in data sets. We have all seen that before; think of System Tables on the relational database. They are data sets that hold metadata about a database table or database columns.
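A Metadata Set in this sense can be pictured as a small keyed table. The following Python sketch is only an analogy, not the Excel implementation: the class names `EntityRecord` and `MetadataSet`, the `ex` prefix, and the placeholder comment text are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityRecord:
    """One row of a hypothetical entity Metadata Set."""
    code: str      # key, e.g. prefix:Localname
    name: str
    comment: str


class MetadataSet:
    """Keyed records holding properties for one kind of model object."""

    def __init__(self, name: str):
        self.name = name
        self._records = {}

    def add(self, record: EntityRecord) -> None:
        if record.code in self._records:
            raise KeyError(f"duplicate key in {self.name}: {record.code}")
        self._records[record.code] = record

    def get(self, code: str) -> EntityRecord:
        return self._records[code]

    def __len__(self) -> int:
        return len(self._records)


entity_mds = MetadataSet("Entities")
entity_mds.add(EntityRecord("ex:BoardAgreement", "Board Agreement", "placeholder definition"))
print(len(entity_mds), entity_mds.get("ex:BoardAgreement").name)
```

Like a system table, the set describes model objects rather than business data, and the key makes each record addressable by later transformation steps.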
The CODT metadata sets are an isomorphic representation of ontology, entity-relationship, and data modeling tool-specific metadata. The transformation is a two-step process: Step one transforms the ontology metadata into generic entity-relationship metadata. The second step transforms the generic metadata into tool-specific metadata. The same generic metadata set is the source for both PowerDesigner and Sparx E.A. metadata sets.
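The two-step process can be sketched as two small functions. This is a simplified Python illustration, not the CODT workbooks: the sheet and column names on the ontology side are assumptions, while the renaming of generic Subtypes to PowerDesigner Inheritances and Sparx E.A. Generalization follows the naming mentioned in this tutorial.

```python
def ontology_to_generic(ontology_mds: dict) -> dict:
    """Step one: ontology metadata -> generic entity-relationship metadata."""
    return {
        "Entities": [{"Code": c["QName"]} for c in ontology_mds["Classes"]],
        "Subtypes": [
            {"Subtype": s["Class"], "Supertype": s["SuperClass"]}
            for s in ontology_mds["SubClassOf"]
        ],
    }


# Step two is mostly renaming generic sets to the tool's own vocabulary.
TOOL_RENAMES = {
    "PowerDesigner": {"Subtypes": "Inheritances"},
    "Sparx EA": {"Subtypes": "Generalization"},
}


def generic_to_tool(generic_mds: dict, tool: str) -> dict:
    renames = TOOL_RENAMES[tool]
    return {
        renames.get(sheet, sheet): [dict(row) for row in rows]
        for sheet, rows in generic_mds.items()
    }


ontology_mds = {
    "Classes": [{"QName": "ex:Bank"}, {"QName": "ex:DepositoryInstitution"}],
    "SubClassOf": [{"Class": "ex:Bank", "SuperClass": "ex:DepositoryInstitution"}],
}
generic = ontology_to_generic(ontology_mds)
pd_mds = generic_to_tool(generic, "PowerDesigner")
ea_mds = generic_to_tool(generic, "Sparx EA")
print(sorted(pd_mds), sorted(ea_mds))
```

Because both tool-specific sets read from the same `generic` result, supporting another modeling tool only needs another entry in the rename table.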
Here is a look at the folder structure. CODT installs in the home directory. In this case, I named it FIBO second-quarter production. It installs with the main workbook, CODT, which has the VBA and macro code, and three subfolders for the Ontology, PowerDesigner, and Entity-Relationship metadata sets. If I open the Ontology Source metadata set folder, it has the workbook, all the SPARQL queries, and the output, the raw query results.
I open the Ontology and the PowerDesigner MDS. On the top is the ontology. Here’s the PowerDesigner. What we see is, basically every metadata set is a sheet in the workbook. The top here, we look closely at classes. Likewise, here for the PowerDesigner MDS, we have entities, inheritance, inheritance link. Basically, the transformation uses this ontology metadata to populate the PowerDesigner metadata that PowerDesigner can directly import as a data model. The classes here, we see the Class Code, Qualified Name, Namespace, SKOS definition. In our target, we see that the Class Code has become an Entity Code. The Name, actually that is a transformation rule: We take the Localname here, in this case, “BoardAgreement,” and we “uncamel” the name to comply with the Logical Naming Standard. The comment populates from the SKOS definition, and the URI is simply the Namespace plus the Localname.
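The class-to-entity rules described here fit in a short Python sketch. The function names and dictionary keys are hypothetical, the `ex` prefix and namespace are placeholders, and the regular expression is one simple way to “uncamel” a name; it does not handle acronyms or other edge cases a real naming standard would need.

```python
import re


def uncamel(localname: str) -> str:
    """Insert a space wherever a lowercase letter or digit is followed by a capital."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", localname)


def class_to_entity(row: dict) -> dict:
    """Map one class record of the Ontology MDS to one entity record."""
    prefix, localname = row["qname"].split(":", 1)
    return {
        "Code": row["qname"],                      # Class Code becomes Entity Code
        "Name": uncamel(localname),                # Localname, uncameled
        "Comment": row["definition"],              # from the SKOS definition
        "URI": row["namespace"] + localname,       # Namespace plus Localname
    }


entity = class_to_entity({
    "qname": "ex:BoardAgreement",                  # placeholder prefix
    "namespace": "https://example.org/ontology/",  # placeholder namespace
    "definition": "placeholder definition text",
})
print(entity["Name"], entity["URI"])
```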
Let’s look at the system specification. Figure 2 of the patent application shows the UML Component Diagram for CODT. There are the two external systems, the Ontology Platform and the Data Modeling Tool, and the internal components: Extraction is the ontology metadata set. Transformation is the E/R metadata set. Load is the PowerDesigner metadata set. We see the two interfaces. Extraction uses the Ontology Platform SPARQL Interface. Load uses the data modeling tool’s, in this case, PowerDesigner’s, import interface. We have a Configuration that we have already seen in the CODT metadata set. Microsoft Excel, in my opinion, is the tool of choice to view and analyze tabular data. Every data architect/data modeler has Excel and knows how to use it. Therefore, MS-Excel is a fast prototyping tool for the CODT metadata sets. It also makes the transformation easy to deploy. Here is a table that shows the components, Extraction, Transformation, Load, the corresponding Ontology, generic E/R, and PowerDesigner metadata sets, and the Excel workbooks that you find with the installation. Any platform and programming language can implement the system, the metadata sets, and the method.
The Extraction works with SPARQL queries. Here is the SPARQL query to extract the Class metadata: It selects the class, the qualified name, namespace, and definition. It filters out the unnamed classes. The query may vary depending on the ontology platform and SPARQL dialect. If you don’t have an “smf:isBound” filter for the namespace and an “afn” function, look up how to filter in your SPARQL dialect. You want to filter out unnamed classes; you do not want to extract them. Here at the bottom, we see the result. My ontology platform, TopBraid Composer, extracts results into a CSV file. We see here our SELECT variables, the license document, a class, the QName, the namespace, and the SKOS definition.
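For platforms without the TopBraid-specific functions, a portable variant might look like the sketch below. It is not the query from the slides: it uses only the standard SPARQL 1.1 `isIRI` filter, and it derives namespace and Localname in Python from the class IRI instead of inside the query. The CSV column names, the sample rows, and the helper names are assumptions.

```python
import csv
import io

# Standard SPARQL 1.1; isIRI() keeps named classes and drops blank nodes.
CLASS_QUERY = """
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?class ?definition
WHERE {
  ?class a owl:Class .
  OPTIONAL { ?class skos:definition ?definition }
  FILTER (isIRI(?class))
}
"""


def split_iri(iri: str):
    """Split an IRI into (namespace, localname) at the last '/' or '#'."""
    cut = max(iri.rfind("/"), iri.rfind("#")) + 1
    return iri[:cut], iri[cut:]


def load_classes(csv_text: str):
    """Read a CSV extract of the query result into class records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        iri = row["class"]
        if not iri.startswith(("http://", "https://")):
            continue  # defensive: skip unnamed classes if the platform exported them
        namespace, localname = split_iri(iri)
        records.append({"namespace": namespace, "localname": localname,
                        "definition": row["definition"]})
    return records


sample_csv = (
    "class,definition\n"
    "https://example.org/ontology/BoardAgreement,placeholder definition\n"
    "_:b0,\n"
)
classes = load_classes(sample_csv)
print(classes)
```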
Here’s our Ontology metadata set again with the classes tab. Under “Data” in Excel, we have “Get and Transform Data.” That lets us source queries from various data sources.
I look at the queries and connections defined in this workbook, the Ontology MDS, and here we have 21 Ontology MDS queries. They populate the spreadsheets that constitute the interface, in other words, the data that the E/R Metadata Sets take. Here at classes, we can see the number of rows loaded, and we have a little preview of the Power Query. If I edit this query, it invokes the Power Query Editor, which shows our data source, the classes CSV extract.
Power Query lets us do different transformation steps. The first step of the Transformation happens in the Entity-Relationship Metadata Set. We have the entity tab, and the Code sources from the Ontology Metadata Set, the class QName. Prefix and Localname break up the code, and then formulas transform the local name into an Entity Name as per the naming convention. We use an “uncamel” function that breaks up the local name, in this case, simply inserting a space.
In the second step, we convert the generic Entity-Relationship MDS into a data modeling tool-specific metadata set. In this case, PowerDesigner can directly import the Metadata Set. For entities, the transformation is a simple copy of the E/R set.
Finally, we can load the PowerDesigner Metadata Set, the Excel spreadsheet, directly into the modeling tool. In PowerDesigner, we can define Excel imports. There are 24 Metadata Sets, for entities, inheritances, data items, attributes, associative entities, relationships, packages, and annotations. Each of these imports has a name and the imported file. It’s in the CODT home directory that we looked at with the entities. The mapping description shows what we’re mapping. Each Metadata Set column maps to a PowerDesigner meta-model object. It facilitates easy import if we use the PowerDesigner meta-model object names as our column names. Here, simply, we have a table Entity that maps to an Entity model object. The properties of an entity are the Code, the Name, the Comment, and then a bunch of extended attributes for the FIBO annotations.
Stacked queries and ETL, together, master the transformation complexity. The screenshot shows Power Query dependencies: On the left-hand side, we have Ontology Metadata Sets; on the right-hand side, Entity-Relationship Metadata Sets. They are from the two Excel workbooks, the Power Query data sources, and tables. Interface MDS are queries and worksheets that subsequent Metadata Sets use as a data source. For example, the E/R Association Supertype Subtype Metadata Set is a set that the PowerDesigner MDS use as a source for their population query. There is, in this case, a hierarchy of Intermediate Metadata Sets, association subtypes, going up to Properties, Active/Passive, and to the E/R Metadata Set.
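One way to think about stacked queries is as a dependency graph that has to be refreshed in order. The Python sketch below uses the standard-library `graphlib` module (Python 3.9 or later) to compute such an order. The Metadata Set names are illustrative placeholders, and Power Query resolves its own dependencies; this only illustrates the idea.

```python
from graphlib import TopologicalSorter

# Each Metadata Set maps to the sets it uses as a data source (names are illustrative).
dependencies = {
    "Ontology.Classes": set(),
    "Ontology.ObjectProperties": set(),
    "ER.Entities": {"Ontology.Classes"},
    "ER.AssociationSupertypeSubtype": {"Ontology.ObjectProperties"},
    "PowerDesigner.Entities": {"ER.Entities"},
    "PowerDesigner.Inheritances": {"ER.AssociationSupertypeSubtype"},
}

# static_order() yields every set after all of its sources.
refresh_order = list(TopologicalSorter(dependencies).static_order())
for step, name in enumerate(refresh_order, start=1):
    print(step, name)
```

An Interface MDS is then simply a node that sets in the next workbook depend on, while an Intermediate MDS is only referenced inside its own workbook.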
Some statistics about the CODT Excel application: The MDS folders that we looked at hold the queries that provide the interface for Metadata Sets in the next transformation step. For the ontology, we have 21 Interface Metadata Sets and some 20 Intermediate ones. For Entity-Relationship, we have 24 Interface and quite a lot of Intermediate Metadata Sets, because that transformation is rather complicated when it comes to Object Properties. Finally, there are the data model, in this case, PowerDesigner Metadata Sets. This step is generally straightforward; all we have to do is some renaming from the generic E/R. For example, in the generic E/R, we call it Subtypes, PowerDesigner calls it Inheritances, and Sparx E.A. calls it Generalization. All in all, there are 150 Excel sheets and Power Queries. CODT is a White Box, an open book. The Excel version fully discloses all worksheets, queries, and the Visual Basic for Applications code. New users and operators can generate the data model import sheets with a single click, using default configuration settings. The Data Architect uses CODT as an ETL and Development Platform for diagnosing results and tweaking transformation rules to match the organization’s modeling and naming standards.
VBA developers may secure, in other words, hide and block, the data sets and fully automate extract and load, or port the application from MS-Excel to a Java program, or put it all server-side.
CODT embodiments: An embodiment is a way to build the invention. For Table 14 of the patent application, I added blue for the Excel embodiment. It breaks down the different ways to build the connection to the ontology source, the transformation system, and the data model. For example, to create a connection, we can directly encode it to connect to an RDF-Store or run the queries in a batch. Instead of using MS-Windows, we can move the application server-side. For the application type, here it’s MS-Excel; we can encode it in our ETL environment or as a Java or C program. The user interface in the Excel application is a White Box, with everything visible, but it can be a guided user interface. For example, you can encode a configuration wizard that takes a user through options and parameters. Besides a Conceptual Data Model, we can generate other types of models: Logical, Physical, Object. Instead of manually importing worksheets into PowerDesigner, we could load directly using the data modeling tool or repository API. For example, ERWin doesn’t have a metadata import from CSV or Excel. If you want to load directly into ERWin, then you would use the ERWin Application Programming Interface.
The Reverse Mode is an exceptional embodiment that doesn’t fit in the previous table about the different ways to implement CODT. The CODT Metadata Sets are bi-directional by design. In other words, they work both ways. CODT can reverse-engineer ontologies from data models. The first step is to extract. The way it works is that the data modeling tool generates List Reports, matching the data modeling tool-specific Metadata Sets. Then the PowerQuery populates the Metadata Sets, performing simple data cleansing. In the Transform step, the Entity-Relationship Metadata Sets populate from Tool-Specific Metadata Sets. The Ontology Metadata Set populates from the Entity-Relationship Metadata Sets. Power Queries and Formulas break the data down into Triples, and then we load these triples onto the ontology platform using SPARQL CONSTRUCT or Bulk Inserts.
In the Reverse example, we extract from PowerDesigner Entities. Our example is a Logical Model created from the New York Stock Exchange Open MAMA messaging API. We have entities like the Auction, Order Book, Quote, Referential, Security Status, and Trade. PowerDesigner, and every data modeling tool, can create list reports. The list report is simply a columnar report with Entity properties. The PowerDesigner Entity List Report has Code, Name, and Comment. The Metadata Set just sources that list report – just like we have seen for the Ontology Metadata Sets importing raw ontology metadata.
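To make the extract step concrete, here is a minimal sketch in Python of sourcing an Entity List Report into a metadata set with simple cleansing. It stands in for the PowerQuery step and is not the CODT code. The columns Code, Name, and Comment are from the talk; the CSV export, the sample rows, the comment text, and the function name `load_entity_mds` are assumptions for illustration.

```python
import csv
import io

# Hypothetical excerpt of an Entity List Report exported as CSV.
# Column names follow the talk; the row content is made up.
REPORT = """Code,Name,Comment
AUCTION, Auction ,An auction event
ORDER_BOOK,Order  Book,
"""


def load_entity_mds(text: str) -> list[dict[str, str]]:
    """Read the list report and cleanse each cell: trim and collapse whitespace."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({key: " ".join((value or "").split()) for key, value in record.items()})
    return rows


entity_mds = load_entity_mds(REPORT)
for row in entity_mds:
    print(row)
```

The cleansing here is deliberately small, in line with the "simple data cleansing" described for the extract step.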
Once we have it in a Metadata Set, the second step is to transform it into the Entity-Relationship Metadata Set, so the E/R Entity MDS populates from the PowerDesigner MDS. There are minimal changes and transformations here. Prefix and URI are Configuration Settings – they must match what we designate in the ontology as a prefix and namespace for these reverse-engineered classes. In this case, I call it “fib-omdm.” The local name transforms the entity name. It’s simply Camel-Coding: we eliminate spaces and capitalize string components. Finally, the resource name is a concatenation of Prefix, delimiter, and Localname.
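The Camel-Coding and concatenation rules can be sketched in a few lines of Python. This is an illustration of the rule as described, not the Excel formula CODT uses. The prefix “fib-omdm” is from the talk; the colon delimiter and the function names `local_name` and `resource_name` are assumptions.

```python
PREFIX = "fib-omdm"  # configuration setting named in the talk
DELIMITER = ":"      # assumption: the usual prefixed-name delimiter


def local_name(entity_name: str) -> str:
    """Eliminate spaces and capitalize the first letter of each string component."""
    return "".join(part[:1].upper() + part[1:] for part in entity_name.split())


def resource_name(entity_name: str) -> str:
    """Concatenate Prefix, delimiter, and Localname."""
    return f"{PREFIX}{DELIMITER}{local_name(entity_name)}"


for name in ["Auction", "Order Book", "Security Status"]:
    print(name, "->", resource_name(name))
```

Only the first letter of each component is changed, so an entity name that already contains capitals inside a word keeps them.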
Finally, to load into the ontology, we have our ontology metadata set, here the class set, and we see that a query populates the Class Metadata Set from the Entity MDS. It is the same transformation in reverse. The class name is from the Entity Name, and the Namespace is a constant that we have defined plus the Localname, and then the SKOS definition sources from the Entity Comment.
Besides, we have several metadata sets that have a “T_” for Triple as a prefix. They break down the class record into triples of Subject, Predicate, and Object. The triples, if we look closely, match the SPARQL SELECT Joins. Here are the triples to create classes. We see the Subject, Predicate, Object. For example, the Auction class is an RDF Type of OWL class. Likewise, the triples for the SKOS definitions: the Auction, “skos:definition,” and then we have the definition text. If we look at the OWL classes query again, the one we use to extract Ontology Metadata about classes, we see here our join “class a owl:Class”; that is the same subject-predicate-object that we now use to create classes. Likewise, the join class SKOS definition and the variable “skos_definition” correspond to our SKOS definition triple set.
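Breaking a class record into Subject, Predicate, Object rows can be sketched as follows. This is a minimal Python illustration of the idea behind the “T_” metadata sets, not the PowerQuery or formula logic itself. The `Triple` type, the function name `class_triples`, and the sample definition text are assumptions.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


def class_triples(resource: str, definition: str) -> list[Triple]:
    """Return the rdf:type triple and, if a definition exists, the skos:definition triple."""
    triples = [Triple(resource, "rdf:type", "owl:Class")]
    if definition:
        # Escape backslashes and quotes so the text is a valid string literal.
        escaped = definition.replace("\\", "\\\\").replace('"', '\\"')
        triples.append(Triple(resource, "skos:definition", f'"{escaped}"'))
    return triples


# Hypothetical definition text for illustration only.
for triple in class_triples("fib-omdm:Auction", "An auction event"):
    print(triple.subject, triple.predicate, triple.object)
```

Each entity yields one type triple, and entities with an empty comment simply get no definition triple.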
Then we can take these triples and assert them in the ontology platform. We can do this either using our Triple-Store Bulk Insert or, in this case, SPARQL CONSTRUCT. The CONSTRUCT statement is wrapped around the SKOS definition triples. When we execute this query, we see here the classes created in the ontology tree, and we see for an individual class, here the Auction class, that the definition has populated.
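Wrapping a CONSTRUCT statement around a set of triples might look like the sketch below, which builds the query text in Python. It is an illustration under assumptions: the namespace URI for “fib-omdm” is a placeholder, the function name `construct_query` and the sample definition are made up, and whether the platform asserts the constructed triples depends on the tool. The rdf, owl, and skos namespace URIs are the standard W3C ones.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "fib-omdm": "http://example.org/fib-omdm/",  # placeholder namespace (assumption)
}

# Hypothetical triples; the definition text is made up for illustration.
TRIPLES = [
    ("fib-omdm:Auction", "rdf:type", "owl:Class"),
    ("fib-omdm:Auction", "skos:definition", '"An auction event"'),
]


def construct_query(triples, prefixes) -> str:
    """Build a SPARQL CONSTRUCT whose template lists the triples; the WHERE clause is empty."""
    header = "\n".join(f"PREFIX {name}: <{uri}>" for name, uri in prefixes.items())
    template = "\n".join(f"  {s} {p} {o} ." for s, p, o in triples)
    return f"{header}\nCONSTRUCT {{\n{template}\n}}\nWHERE {{}}"


print(construct_query(TRIPLES, PREFIXES))
```

For a plain triple store, the same triple rows could instead go into a bulk load file or an update request, which is the other load option mentioned above.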
This is the bi-directional mode: Transformation enables Semantic Enterprise Information Architecture. We have our ontology, the FIBO, maybe other industry ontologies, our in-house ontologies. On the right-hand side, we have FIB-DM, the data model, the Enterprise Data Model, and Project Data Models. We generate data models from industry, domain, and our proprietary ontologies. We design conceptual models in RDF/OWL. We reverse-engineer our data models to extend the Enterprise and Project Ontologies.
The United States Patent and Trademark Office acknowledged the Utility Patent Application for CODT. It is quite comprehensive: 23 drawings, 19 tables, 35 pages of specification. This non-provisional patent application fully discloses the invention. In about half a year, the USPTO publishes the pending application. Twenty claims comprehensively cover the Method, System, and Non-Transitory Storage Medium, and all embodiments. Once granted, the patent protects CODT Licensees and generated models, including the FIB-DM. The patent application enables me to share the inner workings with you in POCs.
The license agreement for CODT is similar to the FIB-DM license agreement. FIB-DM licensees can purchase CODT as an add-on. New users can license FIB-DM and CODT in a bundle. There is no standalone CODT license. Software Deliverables are the MS-Excel CODT workbooks. It’s a Site License – it doesn’t limit the number of users. You are free to modify the software and create new models for internal use. Just like with the FIB-DM license, you must keep derived models and changes to code confidential. The license includes Education Resources. You are free to modify, translate, edit, even lift off images and diagrams, as long as they remain within your organization. Finally, the license covers the intellectual property. In other words, the license grants you rights to the whole space carved out by the CODT patent. You can leverage Metadata Sets, queries, formulas, and algorithms disclosed in the source code and the specification for internal development. But you must not share or sell these embodiments.
Pricing: Licenses, just like for FIB-DM, are priced by institution size. I use your EDM Council membership tier as the segment; on the buy side, for banks or investment companies, that is the assets under management. The EDMC defines three tiers: up to $50 billion in assets, 50 to 200 billion, and above 200 billion. The add-on price for existing FIB-DM licensees is two-thirds of your data model license. So, it would be $10,000 for a Tier-C bank. The bundle price for new users is 1.5 times the standalone FIB-DM license price. Central Banks, Multilateral Lenders, and other qualifying Financial Institutions get the Tier-C price without further discounts, irrespective of the asset size. Large Commercial Lenders and Investment Companies can get the early-adopter or the U.S. Stimulus Discount.
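The pricing rules reduce to two ratios, which the short Python sketch below works through. The two-thirds and 1.5 factors and the $10,000 Tier-C add-on are from the talk; the standalone Tier-C license and bundle figures are only what the arithmetic implies before any discounts, not quoted prices, and the function names are assumptions.

```python
from fractions import Fraction

ADD_ON_RATIO = Fraction(2, 3)  # add-on = two-thirds of the data model license
BUNDLE_RATIO = Fraction(3, 2)  # bundle = 1.5 times the standalone license


def add_on_price(fibdm_license):
    """Add-on price for an existing FIB-DM licensee."""
    return ADD_ON_RATIO * fibdm_license


def bundle_price(fibdm_license):
    """Bundle price for a new user."""
    return BUNDLE_RATIO * fibdm_license


tier_c_add_on = 10_000                            # stated in the talk
tier_c_license = tier_c_add_on / ADD_ON_RATIO     # implied standalone license
tier_c_bundle = bundle_price(tier_c_license)      # implied bundle price

print("Implied Tier-C license:", int(tier_c_license))
print("Implied Tier-C bundle:", int(tier_c_bundle))
```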
About the offer for a Proof of Concept: You can try, test, and evaluate CODT free of charge. The scope should be confined to CODT itself, not the whole SEIA, a colossal enterprise transformation. CODT is a central piece of the puzzle. The POC is about CODT only. Well, FIB-DM already proves that CODT creates a superior data model. You evaluated FIB-DM. The objective for this POC is to prove the concept that CODT also works for your FIBO extensions. You test the application and evaluate the intellectual property. The materials are the Excel Workbooks, the Educational materials, and the Patent Application. That is for your Legal and Compliance Department to assess. Training and support: There are two days of training via online video conferences and three days of support, like Q&A and sending Metadata Sets back and forth via email.
For your POC team, I suggest you have a Management, Finance, or Business Sponsor. It would help if you were authorized to sign non-disclosure and license agreements, which would typically be a director level at U.S. banks. An Ontologist with an in-depth understanding of the FIBO and your in-house ontologies, because you are the one to adapt the template queries to your SPARQL dialect. You produce the raw ontology metadata. The Data Architect has experience in Enterprise Reference Models. You configure CODT to match your naming standards, and you load Metadata Sets into the data modeling tool. Finally, a developer or MS-Excel Power User with experience in VBA, Power Query, and the M-language. You can troubleshoot complex formulas and queries, and explore the other technical embodiments.
About the technical requirements and preparation: I recommend a powerful PC; mine has 32 gigabytes of RAM, it’s on Windows 10, 64 bit, and it has MS-Excel and MS-PowerQuery. In particular, the data modeling tool is a bit of a bottleneck. For huge models, no matter whether it’s in PowerDesigner, Sparx, or ERWin, you need a very powerful PC. Likewise, the transformation in MS-Excel across the three Metadata Sets takes around 10 minutes. This can be considerably slower on a weak PC. The Ontology Platform should have a SPARQL Query User Interface. I use TopBraid, but Protégé or any RDF-Store or Semantic Endpoint should work.
For the data modeling tool, the reference tool is SAP PowerDesigner. If you have ERWin or another modeling tool, I recommend using a PowerDesigner trial first. You can download a PD trial. Import the data model, and then later, you may customize CODT to import into your tool. The FIBO itself should be loaded onto your Ontology Platform before the POC. Try the entity queries and reproduce the raw metadata. This approach is to get around any differences between SPARQL dialects, so try out the query beforehand. Your proprietary ontology should be an extension of the FIBO. In other words, it should import the FIBO. So, make sure to include FIBO modules and have a Prefix defined for your Namespaces. Here’s an example for the Bank Ontology. There’s a Prefix and the URI. The entity query must return FIBO alongside your classes with the Prefix. That’s how you can test it.
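The test described above, that the entity query returns FIBO classes alongside your own prefixed classes, can be sketched as a small check in Python. This is only an illustration: the in-house prefix “bank”, the class “bank:LoyaltyAccount”, and the function names are hypothetical, and treating any prefix that starts with “fibo-” as FIBO is an assumption based on the usual FIBO prefix naming.

```python
def prefixes_returned(qnames):
    """Collect the prefix part of every prefixed name in the query result."""
    return {name.split(":", 1)[0] for name in qnames if ":" in name}


def covers_fibo_and_in_house(qnames, in_house_prefix):
    """True when the result holds at least one FIBO class and one in-house class."""
    found = prefixes_returned(qnames)
    has_fibo = any(prefix.startswith("fibo-") for prefix in found)
    return has_fibo and in_house_prefix in found


# Hypothetical raw metadata: one FIBO class and one in-house class.
raw_class_names = ["fibo-fnd-acc-cur:Currency", "bank:LoyaltyAccount"]
print(covers_fibo_and_in_house(raw_class_names, "bank"))
```

If the check fails for the in-house prefix, the likely causes are a missing Prefix declaration or a query that filters on the FIBO namespaces only.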
Typically, the Proof of Concept has a six-week timeline: two weeks for preparation and kickoff, and then four pretty intense weeks proving the concept. For that reason, POCs are rolling with a maximum of two banks at a time. In other words, I can support two banks at a time, and once one has finished, the next one can engage in the POC. Two weeks are for the introduction to CODT and transforming the FIBO as a Proof of Concept, and then we repeat that transformation exercise with the addition of your proprietary ontologies. You can explore configuration changes and other embodiments.
Conclusion: The Semantic Center of Excellence and Ontologies must not become another Silo. Our vision is Semantic Enterprise Information Architecture, with the ontology at the apex and derived implementation models. The FIBO is the industry standard, and FIB-DM is the superior industry-standard data model. CODT leverages the ontology for data management. Copyrights and Patents protect your investment. Well, so let’s discuss a CODT POC. Just send an email to firstname.lastname@example.org, and we can have an overview and discussion with your questions and answers. Now, I cannot just email you the Excel sheets – you do need a team, and you need an executive sponsor to sign off on the non-disclosure agreements. You find further resources on the FIB-DM website, the YouTube education channel, and follow the LinkedIn showcase for news updates and discussions.
Well, thanks, and have a nice day.