Natural Language and Modeling Intent in EIM

Joe Roushar – May 2019

Competitive Advantage

Is your organization ready for a future with significantly more advanced and competitive information management strategies? Are you ready for the inevitable turbulence that accompanies change, most of it evolutionary but some of it revolutionary? It’s time for Enterprise Information Management (EIM) to grow up and assume its rightful role at the center of competitive advantage. EIM refers to the policies, architectures and practices that define how data is acquired, processed, stored, mastered, enriched, cleansed, transformed, distributed, consumed and exploited. While choices about relational and “Big Data” strategies are critical in the modern enterprise, I will keep this discussion a notch higher, focusing on overall strategies and the roles of AI.

Among the credible end-of-year predictions was a thoughtful article by Phil Fersht. I think he’s spot on with the “integrated automation” focus, and nowhere is this more important than in automating enterprise information management as a core capability, rather than as a bunch of disparate info silos managed, analyzed and understood separately. There is a related opportunity that many predictions seem to miss: using EIM to build “intent models” that can make all automated processes smarter. The intent models can reflect both customer intent and user intent: the information and processes users need to succeed. These models can simplify adding AI to the mix. Approaches described as intelligent ERP and/or intelligent CRM are likely to involve more siloed, narrowly defined architectures. The truly intelligent enterprise will be much more inclusive of information assets that cross boundaries, and of intent models that reflect both customers and internal workers.

For this post, I have liberally borrowed insights from Graham Fyffe, a respected architect I met at Medtronic, Inc., and from germane articles in Harvard Business Review and other sources.

Specific Predictions on Competitive EIM

In Gartner’s “Predicts 2017: Artificial Intelligence”, there are three important predictions that I find salient to the question of competitiveness in EIM:

  • Through 2020, organizations using cognitive ergonomics and system design in new artificial intelligence projects will achieve long-term success four times more often than others.
    • Why care? If projects (investments) are so much more likely to succeed with good AI, why invest in projects that don’t have it?
  • By 2019, startups will overtake Amazon, Google, IBM and Microsoft in driving the artificial intelligence economy with disruptive business solutions.
    • Why care? The underlying reason for this phenomenon is the innovation agility of startups. Choose your partners wisely!
  • By 2019, artificial intelligence platform services will cannibalize revenues for 30% of market leading companies.
    • Why care? This prediction focuses on the AI “Platform”. This is consistent with our understanding of the overall benefits of the Platform Revolution.

Careful reading of the cited articles and others points to the need for better decisions at all levels of the business (see ABCs). Better decisions need better processes and data. Semantic models of the intent behind each type of decision can make this possible.

In the corresponding 2018 predictions, Gartner says “AI and machine learning strategy development/investment is already in the top five CIO priorities.” Accenture also cites this and explains that: “Increasingly, businesses have invested in building AI-enabled applications. Examples range from business chatbots and question answering systems to more complex systems, such as self-driving cars, virtual doctors, and AI-assisted decision-making platforms.” They point out that their clients are increasingly combining search with Natural Language Processing (NLP), machine learning, and big data to add AI capabilities to information management strategies and to the new and existing systems that implement them. The AI component that can serve as a foundation for NLP-based interaction, machine learning and big data analytics is a robust semantic model of intent.

A Case for Holistic

Why is it so important for EIM and AI to transcend functional areas, lines of business, divisions, regions and apps to deliver more sustainable competitive advantage? Because, no matter how distant the data seems to be from the customer or the sale, it can contain clues to help good analysts find ways to reduce costs and/or increase revenues.

Phil Fersht describes a “boundary-less organization where there is only one office that matters – and that is the office that caters to the customer” (ibid). He suggests that the path to this utopian vision includes:

May I add:

  • Semantic models of intent that “know” why in addition to what.

My world view comes partly from being in the IT Architecture business for a long time and having experienced many of the trends and technologies that you talk about in your blog. I have used many architectural and programming styles and patterns. Over the years I have become convinced that most process-centric architectures do not stand the test of time. Part of the genesis of SOA came from early attempts at inter-application, cross-platform communication in standards like HL7, EDI and CORBA (an output of the OMG). I watched CORBA collapse under its own weight, and I would contend that some other weighty Web Services standards are also well down that path.

What I have seen survive is the “information” that applications exchange (I hesitate to use the word “share”). The format for that exchange or even the semantic meaning of the information is often derived from the context of its use, so most “meta data” schemes fail to capture meaning and fail to empower any but the most savvy users and technicians. This is where I am a fan of some of the concepts in “Data Lakes”. In most Data Lake models the creation of “meaning” is placed on the consumer of the information and often meaning is only derived by combining information from many sources that know nothing about one another. This phenomenon suggests that intent models for each domain in the enterprise should be interconnected.
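
To make the Data Lake point concrete, here is a minimal Python sketch of meaning being created on the consumer side. Everything in it is hypothetical and only for illustration: the two sources, their field names and the question being asked are assumptions, not drawn from any real system. The two sources know nothing about one another, and the “meaning” (high-severity tickets per customer segment) exists only once the consumer combines them.

```python
# Hypothetical records landed in a lake from two unrelated source systems.
crm_records = [
    {"cust_id": "C-100", "name": "Acme Corp", "segment": "Enterprise"},
    {"cust_id": "C-200", "name": "Birch LLC", "segment": "SMB"},
]
support_tickets = [
    {"ticket": 1, "customer": "C-100", "severity": "high"},
    {"ticket": 2, "customer": "C-100", "severity": "low"},
    {"ticket": 3, "customer": "C-200", "severity": "high"},
]


def high_severity_by_segment(customers, tickets):
    """Consumer-defined meaning: count high-severity tickets per segment."""
    # The consumer decides that crm "cust_id" and ticket "customer" refer
    # to the same thing; neither source declares that relationship.
    segment_of = {c["cust_id"]: c["segment"] for c in customers}
    counts = {}
    for ticket in tickets:
        if ticket["severity"] != "high":
            continue
        segment = segment_of.get(ticket["customer"], "unknown")
        counts[segment] = counts.get(segment, 0) + 1
    return counts


result = high_severity_by_segment(crm_records, support_tickets)
print(result)
```

The join key lives only in the consumer's head here. An interconnected intent model would be one way to record that knowledge once so that the next consumer does not have to rediscover it.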

Information is more agile than process

Almost by definition, information is more agile than process. Information is easier than process to recombine into different patterns to achieve business insights and transaction patterns. I cannot emphasize enough just how important this is at an enterprise architecture level. The time and cost of making process changes, even with iBPM and other low-code platforms, is higher than the cost of imbuing information with more power and flexibility. Implementing AI solutions as part of the strategy compounds the time and cost of implementation, but done well, can lead to long-term benefits that pay back the initial costs. Furthermore, a temporary or short-term process change is harder and more costly to unwind when business or industry drivers require new adaptations. Enriching data, especially using AI, can serve long-term interests and usually needs no unwind to support future enhancements.

Since the dawn of automation, we have known that things will change. Yes, some systems, especially mainframe banking and insurance systems, seem to hang on forever, but on their periphery swirls constant change. Even business models and basic strategies change in the face of market forces, technology drivers or mergers and acquisitions. The turbulence seems to bring seismic disruptions to all, including the most stable companies. It is the onus of IT organizations to architect solutions capable of resilience in the face of change. As stated earlier and explained to me by Graham Fyffe, information stands the test of time, and information is easier than process for technicians and non-technicians to understand. For both reasons, information-centric architectures will be more successful and thus more competitive. Information is the balanced diet of business and the oxygen of agile innovation.

This ease of use and ease of understanding applies even to the protocols and notations for describing information versus describing process and services. Consider the notations of process: BPMN, BPEL and even UML at a general level. I would contend that ALL of these are complex and heavyweight (real world processes are hard to define). Contrast this with simple ERDs or even simpler tabular spreadsheets that can be used to describe information. Once we get into services (at a technology level) things get even worse: things like XML, JSON or SOAP (ironic since the S stands for Simple!) turn out to be horribly complex in practice, and are made even worse by adding a meta language such as WSDL. Perhaps we cannot easily eliminate the complex standards, but we can encapsulate them and hide them from everyday use (even from developers) by incorporating intelligent processes to support advanced data discovery and movement.

In the end, complexity is the enemy of agility, and information centrism is always going to be simpler than process/services centrism.

Folks from the Harvard Business Review in May-June 2017 reported that “on average, less than half of an organization’s structured data is actively used in making decisions – and less than 1% of its unstructured data is analyzed or used at all. More than 70% of employees have access to data they should not, and 80% of analysts’ time is spent simply discovering and preparing data.” Has this improved much in the last 24 months? No. What would increase an organization’s competitiveness? By the numbers, may I suggest:

  • 80+% of an organization’s structured data should be actively used to support decisions
  • More than 10% of an organization’s unstructured data should be actively used to support decisions
  • Fewer than a handful of employees should have broad access to data, and all data should be better controlled
  • No more than 20% of analysts’ time should be spent discovering and preparing data
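
The gap between the reported figures and these targets is simple arithmetic, sketched below in Python. This is an illustration under stated assumptions: the metric keys are hypothetical names, and the baseline uses the upper bounds of the HBR figures quoted above (“less than half” taken as 50%, “less than 1%” taken as 1%). The employee-access target is left out because “a handful” is a head count, not a percentage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    label: str
    threshold: float        # percent
    higher_is_better: bool  # False means the threshold is a ceiling


# Targets taken from the list above; the dictionary keys are made-up names.
TARGETS = {
    "structured_data_used": Target("Structured data used in decisions", 80.0, True),
    "unstructured_data_used": Target("Unstructured data used in decisions", 10.0, True),
    "analyst_prep_time": Target("Analyst time spent finding and preparing data", 20.0, False),
}


def gaps(measured):
    """Return percentage points still to close per metric (0.0 if target met)."""
    report = {}
    for key, target in TARGETS.items():
        value = measured[key]
        if target.higher_is_better:
            report[key] = max(0.0, target.threshold - value)
        else:
            report[key] = max(0.0, value - target.threshold)
    return report


# Assumed baseline: upper bounds of the HBR figures quoted in the text.
hbr_baseline = {
    "structured_data_used": 50.0,
    "unstructured_data_used": 1.0,
    "analyst_prep_time": 80.0,
}

for key, gap in gaps(hbr_baseline).items():
    print(f"{TARGETS[key].label}: {gap:.0f} points from target")
```

Under those assumptions the largest gap is analyst preparation time, which is consistent with the emphasis on plumbing later in this post.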

Again from the HBR: “Having a CDO and a data-management function is a start, but neither can be fully effective in the absence of a coherent strategy for organizing, governing, analyzing, and deploying an organization’s information assets.” I have been addressing this in this blog for some time with the posts:

Much of this is plumbing – not the sexy part. The predictive models and dynamic dashboards are critical at the same time as being very appealing and fun to trot out to impress people. But without the plumbing, the water going into the beautiful basin may be unreliable. The plumbing is vital to data completeness, accuracy and system performance. HBR further suggests: “ensuring smart data management is the responsibility of all C-suite executives, starting with the CEO.”

EIM for Defense and Offense

To determine the outlines of a competitive EIM strategy, begin by determining the primary purposes of specific types of enterprise data in delivering specific outcomes: ask how do we exploit data to achieve specific objectives? These data exploitation assumptions can serve as the foundation of strategic data management. There are trade-offs between “defensive” and “offensive” data strategies that can be used to strike a balance between controlling data content and access through data standards, defined KPIs and governance, and increasing flexibility by enabling decision makers, “citizen” developers and analysts to access broader swaths of data more easily.

Defensive EIM

The HBR article suggests that data defense aims to minimize downside risk by focusing on data integrity and compliance, implementing measures to protect privacy, limiting and detecting fraud, and preventing data loss (DLP). Defining and enforcing sources of record for each data category and single sources of truth for reporting and analytics are common defensive strategies. There are cases in which defensive strategies can save costs, but implementation and management can increase costs, and as hackers become more sophisticated, the defenses need to become more sophisticated. We are seeing more software security companies implement AI-based defensive strategies, relieving businesses and other software vendors of that burden. AI is becoming an important differentiator in security software. In my experience as an Enterprise Architect and data architect, good data defenses are table stakes for many, and federal and international mandates are increasing the number of organizations for which strong defensive measures are needed.

Offensive Strategy - Slam Dunk

Offense strategies in information management often focus on the bottom line by increasing revenue, profitability, and customer satisfaction. Deploying data to support analytics that can improve customer insights with advanced data analysis, visualization and modeling is the beginning of a good data offense. Putting multiple data sources in the hands of “citizen” analysts, enabling them to slice and dice disparate customer and market data is the next step. Competitive advantage then springs from improved foundations for interactive dashboards to support complex business decision making. It is not the case with data that a good offense is the best defense. A competitive EIM strategy must include the right amounts of both, and I believe they need not be mutually incompatible.

“The more uniform data is, the easier it becomes to execute defensive processes, such as complying with regulatory requirements and implementing data-access controls. The more flexible data is – that is, the more readily it can be transformed or interpreted to meet specific business needs – the more useful it is in offense” (HBR). Resource limitations for implementing EIM strategies in organizations almost always force decision makers to prioritize efforts and funds to the automation capabilities that will bring the most value. Establishing a solid business case for each initiative is a good discipline. Here is a table included in the HBR article that differentiates offensive and defensive data strategies:

There is much confusion and arm-waving about AI and where it can be applied to bring value in the enterprise. Whether your enterprise information management strategy is primarily offensive or defensive or both, the most important questions to ask are probably at the core capabilities level: what AI and learning processes can be applied to this core capability, and will the benefits justify the costs? There are cases in which an organization has enough money and bandwidth to play around with several different ideas and see which pan out. I think those are exceptional organizations, and most budgets and teams should be focused on initiatives most likely to deliver enough business value to justify the investment. So I assembled some ideas on where to apply AI/ML to augment core EIM capabilities.

There are many processes that can be included in an Enterprise Information Management strategy beyond the ones mentioned in the table above. Notably absent are data cataloging and references to some of the finer-grained processes that uniquely apply to streaming data and big data. The TOGAF architecture standard chapter on data management services (TOGAF Ref) provides useful descriptions of these. They include:

  • Data Dictionary and Repository services
    • Description: These are repositories for data stewards, modelers, administrators and engineers to define and use the metadata that defines all data objects. Data dictionaries are often vendor/technology specific, but heterogeneous data dictionaries may support multiple database types and sometimes unstructured data as well.
    • AI opportunity: To simplify building and managing data dictionaries, AI bots should scan data sources and automatically build and periodically update the catalog. Semantic associations between data tables and columns in the catalog will dramatically increase its usefulness and empower users. Intent models describing domains, subdomains, concepts and associations are based on robust semantic associations.
  • Database Management System (DBMS) services
    • Description: Database admin for both ACID and BASE transactions and functionality for database admin are standard in commercial databases. Languages like SQL, NoSQL, Cypher and SPARQL provide controlled access to structured data in relational and graph databases.
    • AI opportunity: Determine the best data platform for each type of data and carefully consider the platform’s adaptability to support AI and automatically generated queries and intent models. Tie all databases, tables and columns to a central/aggregate enterprise intent model.
  • File Management services
    • Description: File and document structured databases and file systems are growing in importance as part of the analytics and data exploitation ecosystem.
    • AI opportunity: Scan full text and tie all documents at a section or clause level to a central/aggregate enterprise intent model.
  • Query Processing functions
    • Description: Queries provide interactive selection, extraction, insertion, deletion and formatting of information in files and databases.
    • AI opportunity: Intelligent automated query generators that use the intent model to associate the users’ questions with data / content.
  • Screen Generation functions
    • Description: Dashboards and other visualizations provide the capability to define and generate views from the retrieval, presentation, and update of data.
    • AI opportunity: AI can be used to adapt to each individual dashboard user’s preferences both in content (including filters and sorts) and presentation (labels, types of visualizations, arrangement…).
  • Report Generation functions
    • Description: Provide the capability to define and generate hardcopy reports composed of data extracted from a database.
    • AI opportunity: Paper requires no special intelligence.
  • Networking/Concurrent Access functions
    • Description: The functions that manage concurrent user access to databases are typically not included in an EIM framework.
    • AI opportunity: The use of AI monitoring scans to improve security and access control is growing exponentially as threats mature and proliferate.
  • Data Warehousing functions
    • Description: Functions and systems that store very large amounts of data, usually captured from other database systems, and functions to support or perform online analytical processing on it in support of ad hoc queries.
    • AI opportunity: AI can be used on data in transit into the warehouse to ensure it is meaningfully cataloged and registered. Intelligent monitoring and rules can identify inefficient queries and tune or redirect them to preferred sources.
  • Document Generic Data Typing and Conversion
    • Description: Services and specifications for encoding data (text, image, video, numeric, special character…) and the logical / visual structures of electronic documents/compound documents.
    • AI opportunity: There are many use cases for data typing and conversion, but there are fewer opportunities for automating decisions using AI: only AI for automating the conversion itself.
  • Graphics Data Interchange
    • Description: Services for device-independent descriptions of picture elements for vector-based and raster-based graphics.
    • AI opportunity: These services are standardized and AI opportunities are clustered around data cleansing and transformation.
  • Specialized Data Interchange
    • Description: Services and specifications like EDI that describe data for specific vertical markets such as the Medical, Library, Dental, Assurance, and Oil industries.
    • AI opportunity: These services are standardized and AI opportunities are clustered around data cleansing and transformation.
  • Electronic Data Interchange
    • Description: Services for paperless office and e-commerce to improve quality, responsiveness, and savings. Such services include vendor search and selection, contracting, product catalog, shipping, forwarding, receiving, customs, e-payment, credit card, inventory control, claims, electronic tax filing and more.
    • AI opportunity: This broad category has many potential applications of AI to facilitate, improve and exploit information in transit supporting the full range of transactions. Electronic Claims Filing for Health Insurance, for example: use AI to improve the coding for both diagnosis (ICD) and procedures (CPT).
  • Fax
    • Description: Services to create, examine, transmit, or receive fax images.
    • AI opportunity: Use AI or natural intelligence to replace faxing entirely.
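
Several of the ideas above (tying tables and columns to a central intent model, then using that model to associate users’ questions with data) can be sketched in a few lines of Python. This is a toy illustration, not a description of any product or of a real NLP pipeline: the `IntentModel` class, the concept names, the table and column names, and the naive keyword matching are all assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    terms: set[str]  # the concept name plus single-word synonyms
    columns: list[tuple[str, str]] = field(default_factory=list)  # (table, column)


class IntentModel:
    """Toy enterprise intent model: concepts tied to catalog columns."""

    def __init__(self) -> None:
        self._concepts: dict[str, Concept] = {}

    def add_concept(self, name: str, synonyms: tuple[str, ...] = ()) -> None:
        terms = {name.lower()} | {s.lower() for s in synonyms}
        self._concepts[name] = Concept(name, terms)

    def tie(self, concept: str, table: str, column: str) -> None:
        """Associate a catalog column with a concept."""
        self._concepts[concept].columns.append((table, column))

    def resolve(self, question: str) -> dict[str, list[tuple[str, str]]]:
        """Map a question to concepts by keyword overlap (a stand-in for NLP)."""
        words = set(re.findall(r"[a-z]+", question.lower()))
        words |= {w[:-1] for w in words if w.endswith("s")}  # crude plural handling
        return {
            c.name: list(c.columns)
            for c in self._concepts.values()
            if c.terms & words
        }


def generate_queries(hits: dict[str, list[tuple[str, str]]]) -> list[str]:
    """Emit one trivial SELECT per tied column; a real generator would join."""
    return [
        f"SELECT {column} FROM {table}"
        for columns in hits.values()
        for table, column in columns
    ]


model = IntentModel()
model.add_concept("customer", ("client", "account"))
model.add_concept("revenue", ("sales", "income"))
model.tie("customer", "crm.accounts", "account_name")
model.tie("revenue", "finance.invoices", "amount_usd")
model.tie("revenue", "sales.orders", "order_total")

hits = model.resolve("Which clients drove the most sales?")
queries = generate_queries(hits)
print(queries)
```

The point of the sketch is the shape of the data, not the matching logic: once columns from separate systems are tied to shared concepts, a question phrased in business terms can be routed to every source that carries that concept.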

Each possible enhancement should be matched with a business outcome and weighed against the cost to determine the business value. This process makes prioritization straightforward. Some experienced business leaders working with experienced technologists can analyze opportunities and make decisions quickly. For most, formal methodologies and architecture discipline are needed. But the day will soon be upon us when not just large enterprises, but mid-sized businesses will need to integrate AI in ways that lead to competitive advantage, and information will be at the center.
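
One simple way to make that weighing explicit is a benefit-to-cost ranking, sketched below in Python. The candidate enhancements, the benefit and cost figures, and the cut-off ratio are all invented for illustration; a real business case would use the organization's own estimates and probably a richer valuation method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enhancement:
    name: str
    outcome: str             # the business outcome it is matched with
    expected_benefit: float  # hypothetical figure, any consistent currency unit
    estimated_cost: float    # hypothetical figure, same unit

    def __post_init__(self):
        if self.estimated_cost <= 0:
            raise ValueError("estimated_cost must be positive")

    @property
    def value_ratio(self) -> float:
        return self.expected_benefit / self.estimated_cost


def prioritize(candidates, min_ratio=1.0):
    """Drop candidates whose benefit does not cover cost, then rank the rest."""
    viable = [c for c in candidates if c.value_ratio >= min_ratio]
    return sorted(viable, key=lambda c: c.value_ratio, reverse=True)


# Invented candidates loosely based on the services listed above.
candidates = [
    Enhancement("Intent-driven query generation", "Faster analyst answers", 400.0, 250.0),
    Enhancement("Automated data catalog", "Less time preparing data", 300.0, 100.0),
    Enhancement("AI for hardcopy reports", "None identified", 20.0, 50.0),
]

ranked = prioritize(candidates)
for item in ranked:
    print(f"{item.name}: {item.value_ratio:.1f}x ({item.outcome})")
```

With these made-up numbers the catalog work ranks first and the report idea falls below the cut-off, which is the kind of quick triage the paragraph above has in mind.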

By | 2019-10-01T16:57:46+00:00 May 12th, 2018|Cognitive Software, Information Management Systems|0 Comments
