Getting the Knowledge Out
How do you know — anything? Chemicals and electrical impulses splash around in the brain, and voila: we understand the meaning of life, the universe and everything. We have looked at how synapses connect neurons, and how taxonomical and other associations connect concepts, but is it even possible for a computer to understand the complex meaning of relationships between information objects: to know something?
The clever ways we have devised in the information age for delivering knowledge to users are largely based on the patterns of words, books and drawings, all of which have been around for thousands of years. While audio and video recordings are important advances, they are really nothing more than a new encoding for words and sequences of images. Papyrus and parchment delivered exactly the same amount of meaning, at a lower volume and velocity, as web pages, e-books and digital content management systems.
While the accessibility of arbitrary quantities of information has skyrocketed, the amount of effort needed to understand it (derive meaning), specifically chemicals and electrical impulses splashing around in the brain, hasn’t changed much with the advent of computers. Translating lists of numbers to visualizations does reduce the brain power required to interpret statistics, and is important in society and business. But computers are capable of much more when bits and bytes are converted into meaningful and actionable knowledge.
Meaning in the Model
Many technical people do not understand the meaning of “meaning”, “semantics” or “models”. This disheartens a semantic model bigot such as me. But I know how scary semantics, Natural Language Processing, AI and cognitive computing can be. We have frankly seen years of failures and disappointments in “semantics” projects and technologies in enterprise settings, but new tools for modeling meaning are appearing on the horizon. Several commercial products help designers and architects map the structure of objects and their associations, and use these maps to deliver “actionable” knowledge, the core component of which is contextualized meaning. They do this by giving meaning structure and consumability, which is, by the way, the only path to actionable knowledge. I’ll explain more about consumability in my next post.
In an organization with complex computing systems, there are several cases in which models of meaning can bring very specific value. Consumable sources of meaning, such as ontologies, can be used to empower citizen developers for self-service analytics and other data-intensive capabilities. Advanced Master Data Management may use meaning as a basis for better classifying data. Enterprise Knowledge Management projects, it goes without saying, must focus on meaning, and on using an understanding of meaning in context to classify information better when it is captured, so it can be delivered whenever needed. We’ve come a long way from stone tablets to databases and advanced document and web search, so I’d like to take a minute to examine the underlying model most commonly in use today.
In most operational business information models, database tables, rows and columns implicitly represent associations, though the exact nature of those associations may be undocumented. Associations in a database table are defined by the records in each row and the named attributes in each column. The name of the table tells us which symbolic association the rows reflect: all the values in a row are associated with the named instance at the head of the row. That could be the name of a person or product, or the ID number of an order. Attribute relationships are reflected in the column names: every column shows an attribute of the head of each row. Two important elements of meaning are missing from the relational database model: where information elements fit into the overall enterprise hierarchy or ontology, and how information objects are associated with processes or capabilities.
Both instance (row) and attribute (column) are hierarchical associations. The added dimension in relational databases, with primary and foreign keys, enables us to express functional relationships of any arity, cardinality or modality (one-to-one, one-to-many, many-to-many). These are represented in the gray-scale image above as lines with different endings, depending on the relationship. The relational model is a good representation for holding information on the functions or transactions of day-to-day business, such as “orders” and “customers”. It provides needed context, though it doesn’t necessarily reduce the amount of brain power required to interpret the data.
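The row/column associations and the key-based functional relationships described above can be made concrete with a small sketch. This is a minimal, illustrative schema (the table and column names are my own, not from any system discussed here), using Python's built-in sqlite3:

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),  -- foreign key: one customer, many orders
    total REAL)""")
con.execute("INSERT INTO customers VALUES (1, 'Suzie')")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(10, 1, 19.99), (11, 1, 5.00)])

# The join recovers the one-to-many association the foreign key expresses.
rows = con.execute("""SELECT c.name, o.id, o.total
                      FROM customers c JOIN orders o ON o.customer_id = c.id""").fetchall()
print(rows)  # [('Suzie', 10, 19.99), ('Suzie', 11, 5.0)]
```

Note that nothing in the schema itself says where "customers" fits in the enterprise ontology, or which business process produces orders; exactly the two missing elements of meaning identified above.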
Before relational databases, we had flat databases. Flat database models are conceptually one dimensional. A single dimension reflects only hierarchical or physical relations. Relational models add the conceptual dimension of function, making them conceptually two-dimensional. Object model databases can reflect even more dimensions.
The Object Model and Big Data
In the object database model, we expand the number of data elements, columns, or rows that can have explicit relations. This expansion builds on the relational model by letting us add any kind of logical or conceptual association, in the same way that web pages with hyperlinks let us associate any snippet of text with further information. The illustration below depicts multiple database tables with object associations, or relations between multiple objects in multiple tables. Conceptually, this model, as it grows to include all the data objects and permutations in a company, may be difficult to visualize as a database application that a business could use. This complexity is one of the reasons for the rise of Service-Oriented Architectures and microservices, which enable us to manage complexity in bite-sized chunks, the opposite of monolithic apps that do everything for us. Even so, the similarities between object models and the distributed nature of the human brain’s information structures and processes are important to consider, especially when building more intelligent processes.
Let’s get right to a pragmatic use of this technology. In the order entry system example we all know and love, we often see a one-to-one relationship between a product and its unit cost, though this may change for the duration of a sale. In the airline industry, companies dynamically price the seats on each plane. A passenger purchasing a seat one minute may get it for a different price than another passenger buying an equivalent seat a minute later. The formulae for determining pricing are very complex and dynamic. To manage complex processes like this, tight coordination between data objects and the processes that use and modify them is critical.
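To make the coordination point concrete, here is a toy pricing rule of my own invention; real airline revenue-management formulae are far more complex. The point is only that the price depends on the current state of the inventory object, so the seat object and the process that reprices it must stay in lock-step:

```python
def seat_price(base: float, seats_left: int, capacity: int, days_out: int) -> float:
    """Toy dynamic-pricing rule: price rises as the plane fills and departure nears.

    Illustrative only; real airline formulae are proprietary and far more complex.
    """
    load_factor = 1 - seats_left / capacity        # 0.0 empty .. 1.0 full
    scarcity = 1 + load_factor                     # up to 2x for a full cabin
    urgency = 1 + max(0, 14 - days_out) * 0.05     # climbs inside two weeks
    return round(base * scarcity * urgency, 2)

# Two passengers buying "the same" seat minutes apart can see different prices,
# because the inventory object changed between their requests.
print(seat_price(100.0, seats_left=60, capacity=180, days_out=30))  # 166.67
print(seat_price(100.0, seats_left=59, capacity=180, days_out=30))  # 167.22
```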
Now, you are an airline executive and you want to examine the results of your dynamic pricing program to adapt the formulae in an effort to deliver higher profits and more sales. Standard relational database models tend to lack the expressiveness to show the complex relations between sales, profits and flights. Expressiveness is a major strength of the object model. While object databases such as GemStone never achieved significant market penetration, very similar models in graph databases, document databases and other “Big Data” platforms bring the same value without the penalties of earlier ODBMS implementation strategies.
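The kind of expressiveness a graph model adds can be sketched without any database at all: typed edges between arbitrary objects. The node names and relation types below are made up for illustration; a real graph database adds indexing, persistence and a query language on top of exactly this idea:

```python
# A minimal in-memory "graph" of typed relations, sketching how a graph database
# expresses associations the relational model makes awkward. All names illustrative.
edges = [
    ("flight:LH454", "generated",      "sale:98231"),
    ("flight:LH454", "generated",      "sale:98232"),
    ("sale:98231",   "contributes_to", "profit:Q3"),
    ("sale:98232",   "contributes_to", "profit:Q3"),
]

def neighbors(node: str, relation: str) -> list:
    """Follow edges of a given type outward from a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

# Traverse flight -> sales -> quarterly profit in two hops.
sales = neighbors("flight:LH454", "generated")
profits = {p for s in sales for p in neighbors(s, "contributes_to")}
print(sales)    # ['sale:98231', 'sale:98232']
print(profits)  # {'profit:Q3'}
```

Answering the executive's question ("which flights fed which profits?") becomes a traversal rather than a chain of joins.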
The Client/Server and n-Tier Models
We have been talking about strictly software models in our discussion of DBMS formalisms. Client/Server (C/S) and n-tiered architectures are combinations of software and hardware models. The client is a computer that gets or displays information, and the server is the computer where the information is stored and/or processed. Typically, the client is a workstation and the server is a more powerful mini- or mainframe. The reason organizations have raced to adopt the client/server and n-tier models is that they are efficient ways to make data accessible to many users. Adding mobile devices as clients makes the hardware aspect even more critical to the system designer.
The server is in the back end. The data resides there, available for authorized clients. Think of a bank with many tellers who need to get account information for account holders. If all account information is stored on a central server, then any teller with a client program connected to the server can make inquiries about accounts. Besides inquiries, clients can be set up to perform transactions in real time. If the number of transactions is large, the system can even store batches of transactions and process them when the bank is closed or when the number of concurrent inquiries is lower.
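The batching pattern just described can be sketched in a few lines. This is a single-process stand-in for what would really be a networked system; the function and account names are mine:

```python
from queue import Queue

# Tellers enqueue transactions during the day; the server drains the queue
# off-hours. Illustrative sketch only; a real system adds durability and auth.
pending = Queue()

def submit(account: str, amount: float) -> None:
    """Client side: record a transaction without processing it immediately."""
    pending.put((account, amount))

def process_batch(balances: dict) -> int:
    """Server side: apply every queued transaction, e.g. after closing time."""
    applied = 0
    while not pending.empty():
        account, amount = pending.get()
        balances[account] = balances.get(account, 0.0) + amount
        applied += 1
    return applied

submit("acct-1", 250.0)
submit("acct-1", -40.0)
submit("acct-2", 100.0)
balances = {}
print(process_batch(balances))  # 3
print(balances)                 # {'acct-1': 210.0, 'acct-2': 100.0}
```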
n-Tier systems often have middleware components that perform complex processing, transaction processing, or other applications to reduce the burden on front- and back-end machines. The brain specializes in a similar way, and that is useful, but it doesn’t bring users any closer to understanding the meaning of the information being processed. Many DBMS packages – flat, relational and object – are designed for n-tier architectures. These can be powerful tools for businesses that need to connect many users to a shared data source. Now we need to bring meaning to the architecture.
The Open Group defines meaning in Archimate v.2 architecture modeling: “Meaning is defined as the knowledge or expertise present in a business object or its representation, given a particular context” (Archimate Meaning). The illustration at right describes the links between content and meaning. This definition correctly states that meaning is context-dependent. As an example, the meaning of Suzie buying a new device from Amazon means that:
- Suzie decided that the value of the device in her life was worth the expense
- Suzie will soon receive the device and benefit from using it
- Suzie may track the package so she can be home when it is delivered
For Amazon, the same event means something completely different:
- Amazon successfully sold a product that an Amazon planner thought was the kind of thing Amazon customers might want
- Amazon will have to process, pick and deliver the order, or notify a partner to do so
- Amazon may have to adjust inventory counts, and may display low counts on the website
These two completely different takes on meaning are not incompatible. In fact, there are people at Amazon who care enough about the “Customer Journey” and “User Experience” that they try to empathize with Suzie and redefine meaning and process at Amazon in a way that somehow accommodates Suzie’s frame of meaning.
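The context-dependence of meaning, one event, two frames, can be sketched as a set of per-context interpreters. The event fields and interpreter functions below are my own illustration, not a standard model:

```python
# One event, interpreted differently per stakeholder context. Illustrative only.
event = {"type": "purchase", "buyer": "Suzie", "item": "device", "sku": "D-100"}

def meaning_for_buyer(e: dict) -> list:
    return [f"{e['buyer']} judged the {e['item']} worth its price",
            f"{e['buyer']} will want to track the delivery"]

def meaning_for_seller(e: dict) -> list:
    return [f"demand signal recorded for {e['sku']}",
            "order must be picked and shipped (or handed to a partner)",
            "inventory count must be decremented"]

interpreters = {"buyer": meaning_for_buyer, "seller": meaning_for_seller}
for context, interpret in interpreters.items():
    print(context, interpret(event))
```

A system that carries both frames, rather than only its own, is one that can start to accommodate Suzie’s frame of meaning.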
Intent and Certainty
The Open Group goes on to associate data objects with human intent: “A meaning is the information-related counterpart of a value: it represents the intention of a business object or representation (for example, a document, message; the representations related to a business object). It is a description that expresses the intent of a representation; i.e., how it informs the external user” (Ibid). Archimate 3 goes a step further to associate meaning and value as shown in the example below.
This diagram ties processes to meaning, largely as “outcome”. But why do we need meaning? Humans are perfectly capable of interpreting their own meaning based on the data. My key point is that automation can do more, thus relieving more burden from the humans, by reducing the amount of interpretation needed, and performing appropriate (and low risk) tasks automatically. Furthermore, the messaging or notifications that “systems” spit out, lacking context, may succeed in delivering the “Being Informed” value shown above, and completely fail to deliver the “Peace of Mind” or “Certainty” that humans crave. This could be a matter of messaging that should be better crafted by User Journey specialists. Or, it could be a capability of more intelligent systems in the semantic enterprise.
The concentric rings at the beginning of this post show ideas, and their symbols, as the center of meaning. The blue triangles at right show a more nuanced model of meaning, in which an idea is described as a property. In this diagram, concepts are tied into domains through categories.
Before we consider the meaning model and its ability to represent more, and more complex, relations, consider the hundreds of thousands of perfectly good database systems in use in companies around the world. They use flat spreadsheets and relational database technologies that work. The fact that they work is a critical piece of information: the last thing a die-hard pragmatist would suggest is throwing out something that works perfectly well. On the other hand, the object model is ideally suited to the class of problems in which more complex relations are required to build a better mousetrap (or expert system, or decision support system…). This is where the model of meaning, and meaning modeled in the enterprise, can bring significant value.
Perhaps I should have begun with the sub-field of computational linguistics, but every time we type a word in a system, whether a document or database or web page, language encounters computation. Wikipedia tells us that: “Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective.” Plato at Stanford goes a bit further to suggest that: “Computational linguistics is the scientific and engineering discipline concerned with understanding written and spoken language from a computational perspective, and building artifacts that usefully process and produce language, either in bulk or in a dialogue setting.”
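The "statistical or rule-based modeling" in those definitions can be shown at toy scale. The topic names and keyword lexicons below are made up for illustration; real computational linguistics combines statistical models with rules like these:

```python
from collections import Counter
import re

# A deliberately tiny rule-based classifier: keyword lexicons per topic.
# Lexicons and topic names are invented for this sketch.
LEXICON = {
    "finance":  {"invoice", "payment", "account", "balance"},
    "shipping": {"package", "delivery", "tracking", "carrier"},
}

def infer_topic(text: str) -> str:
    """Score each topic by lexicon hits and return the best match."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {topic: sum(tokens[w] for w in words)
              for topic, words in LEXICON.items()}
    return max(scores, key=scores.get)

print(infer_topic("Your package is out for delivery; tracking number inside."))
# shipping
```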
The Architecture of Meaning
There are two architecture components that I believe will appear most frequently in the semantic enterprise:
- An enterprise meaning model and repository with:
  - A top-down taxonomy of the entire enterprise describing meaning in the context of the business
  - A controlled vocabulary with terms, definitions, KPIs, formulas and identified stewards
  - References to all structured and unstructured information assets
  - Semantic metadata describing topic and other core elements of meaning for all assets
  - Sensitivity metadata to safeguard private information assets
  - Process metadata that associates specific categories of information with business processes
  - Meaningful links associating asset metadata and the enterprise taxonomy
- Bots that automate asset discovery and semantic classification:
  - Able to extract meaning from structured (e.g. DDL) and unstructured (e.g. text) information
  - Using NLP to infer, as accurately as possible, the topical content of each asset
  - Able to operate at the right granularity to deliver maximum value to knowledge workers
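One record in such a repository might tie most of these elements together. The schema below is purely an assumption of mine, not a standard; it only shows how asset references, semantic and sensitivity metadata, process links and a taxonomy link could cohabit in one structure:

```python
from dataclasses import dataclass, field

# Sketch of one record in the enterprise meaning repository described above.
# All field names and example values are illustrative assumptions.
@dataclass
class AssetRecord:
    asset_uri: str                                 # reference to the information asset
    topics: list = field(default_factory=list)     # semantic metadata (core meaning)
    sensitivity: str = "public"                    # safeguards private assets
    processes: list = field(default_factory=list)  # business processes that use it
    taxonomy_node: str = ""                        # link into the enterprise taxonomy

record = AssetRecord(
    asset_uri="s3://warehouse/orders/2024.parquet",
    topics=["orders", "pricing"],
    sensitivity="internal",
    processes=["order-to-cash"],
    taxonomy_node="/enterprise/sales/orders",
)
print(record.taxonomy_node)  # /enterprise/sales/orders
```

A discovery bot would populate records like this automatically; the taxonomy_node field is what makes the asset findable through the enterprise meaning model rather than only by name.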
In future posts I will describe these architectural components more thoroughly, and show how they are an evolution in the best of current IT practice (microservices, n-Tiered architecture, Big Data…).