Open or Opaque Artificial Intelligence

Data is the black gold of the information era, with a value chain and ecosystem similar to that of petroleum. Data in its raw format needs to be refined (as crude oil does) before it adds meaning and usefulness to any domain.

AI and its constituent fields (machine learning, natural language processing, deep learning, etc.) are set to be a societal game changer across every domain of collective human imagination.


The ambition should be to design for a sustainable AI future, aiming to incorporate the UN's 17 Sustainable Development Goals with ethics at the core. One omnipresent hurdle is still the black box, or opaque, setting: being able to understand how, why and where different AI systems operate and exert influence.

The open paradigm

All known AI utilities share a simple model:

input → model → output, with feedback (learning).
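As a minimal sketch of this loop, consider a toy online learner (a perceptron-style classifier, chosen here purely for illustration): inputs flow through the model to produce outputs, and feedback on errors adjusts the model — the learning step.

```python
# Minimal sketch of the input -> model -> output -> feedback loop,
# using a toy online (perceptron-style) learner. Illustrative only.

def predict(weights, x):
    # model: weighted sum of the input, thresholded to a class label (output)
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0

def update(weights, x, label, lr=0.1):
    # feedback: nudge the weights towards the correct answer (learning)
    error = label - predict(weights, x)
    return [w + lr * error * xi for w, xi in zip(weights, x)]

weights = [0.0, 0.0]
samples = [([1.0, 0.0], 1), ([0.0, 1.0], 0), ([1.0, 1.0], 1)]
for x, label in samples * 20:       # repeated passes close the loop
    weights = update(weights, x, label)

print([predict(weights, x) for x, _ in samples])
```

After a few feedback passes the model reproduces the intended labels for the training inputs.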

There is a need to shift control from the computer back towards the human, thereby enabling the addition of meaning and semantics along with conceptual models.

By using open innovation, open standards, open models (knowledge graphs, ontologies, terminologies, code systems and the like), open software and open platforms (technology stacks, e.g. SingularityNET) in the design of future AI utilities and cognitive computing, there are opportunities to leverage learning in a meaningful way, away from the opaque regime and towards cognitive-informed artificial intelligence. Interoperability enables efficient communication that can accommodate data from different semantic domains that traditionally have been separate. Open domain knowledge and data-sets (as linked data) provide very good platforms for continuously improved datasets within the AI loop, both for refining and addressing context and for enabling improved precision and outcomes.

Informative communication means that a word should allow accurate mental reconstruction of the sender's intended meaning, but we are well aware of the human messiness (complexity) within language, as described by the information bottleneck (Tishby) and rate-distortion theory (Shannon).

To take on the challenges and opportunities within AI, there are strong undercurrents to build interdisciplinary capacities, as with Chalmers AI Research, AI Innovation of Sweden and the like, where computer science, cognitive science, data science, information science, the social sciences and more disciplines meet and swap ideas to improve value creation within different domains, while at the same time beginning to blend industry, the public sector, academia and society together.

The societal challenges that lie ahead open up for innovation, where AI-assisted utilities will augment and automate for the benefit of humankind and the earth; but doing so requires a balancing act where the open paradigm is favoured. AI is designed, it is an artefact, hence we need to address ethics in its design with ART (Accountability, Responsibility and Transparency), as in the EU draft guidelines on AI ethics.

Tinkering with AI

The emerging development of AI follows a different pathway than that of traditional software engineering. All emerging machine learning, NLP and/or deep learning machinery relies on a tinkering approach of trial and error: re-model, refine the data-set, and test-bed with different outcomes and behaviours, before it can reach a maturity level fit for the industrial stages of digital infrastructure, as with Google Cloud or similar services. A great example is image recognition and computer vision, with its data optimisation algorithms and processing steps, where each development has emerged from previous learnings and tinkering. Sometimes the mathematical models developed simply do not measure up to real AI problems and utilities.

Value creation, the why in the first place, should guide how we design and use ML, NLP and deep learning, with an expected outcome in mind. AI is not, and never will be, the silver bullet for all problem domains in computing! Making sense is, in essence, what is needed, with contextual use-cases and utilities, long before we reach Artificial General Intelligence.

On the 25th of April, an event will cover Sustainable Knowledge Graphs and AI together with the Linked Data Sweden network.

Tinkering with knowledge graphs

I don’t want to sail with this ship of fools, on the opulent data sea, where people are drowning without any sense-making knowledge shores in sight. You don’t see the edge before you drop!

Knowledge Engineering

Echoencephalogram (Lars Leksell) and neural networks

How do organisations reach a level playing field, where it is possible to create a sustainable learning organisation [cybernetics]?
(Enacted Knowledge Management practices and processes)

Sadly, in many cases, we face the tragedy of the commons!

There is an urgent need to iron out the social dilemmas and focus on motivational solutions that strive for cooperation and collective action: knowledge deciphered with the notion of intelligence, and emerging AI utilities acting as assistants alongside us humans. We the peoples!

To make a model of the world, to codify our knowledge and apply worldviews to complex data, is nothing new per se. A knowledge graph is, in essence, a constituted shared narrative within a collective imagination (i.e. an organisation), where facts about things and their inherent relationships and constraints define the model to be used to master the matrix. These concepts and topics are our means of communication, bridging between groups of people: shared nomenclatures and vocabularies.

Terminology Management

Knowledge Engineering in practice

At work, building a knowledge graph, there are some pillars that the architecture rests upon. First and foremost is the language we use every day to undertake our practices within an organisation: the corpus of concepts, topics and things that revolve around the overarching theme. No entity acts in a vacuum without shared concepts. Humans coordinate work practices through shared narratives embedded into concepts and their translations from person to person. This communication might use different means, like cuneiform (in ancient Babel) or the digital tools of today. To curate, cultivate and nurture a good organisational vocabulary, we also need to develop practices and disciplines that to some extent resemble those of the ancient clay-tablet librarians: organising principles for the organising system (information system, applications). This discipline could be defined as taxonomist (taxonomy manager), knowledge engineer or information architect.

Set the scope – no need to boil the ocean

All organisations, independent of business vertical, have known domain concepts that are defined by standards, code systems or open vocabularies. A good first step is therefore to go foraging in the sea of terminologies, to link, re-hash/re-use and manage the domain. The second task in this scoping effort is to audit and map the internal terrain of content corpora. Information is scattered across a multitude of organising systems, but within these there are pockets of structure: glossaries, controlled vocabularies, data-models and the like. The taxonomist, together with subject matter experts, then arranges governance principles, engages in conversations on how the outer and inner loops of concepts link, and starts to build domain-specific taxonomies, preferably using the Simple Knowledge Organization System (SKOS) standard.

Participatory Design from inception

Concepts and their resource descriptions will need to be evaluated and semantically enhanced with several different worldviews from all practices and disciplines within the organisation. Concepts might carry different meanings: meaning is subjective, demographic, socio-political and complex, and it sometimes gets lost in translation (between different communities of practice).

The best approach to getting a highly participatory design in the development of a sustainable model is simply to publish the concepts as open thesauri. A great example is the HealthDirect thesaurus. This service becomes a canonical reference that people are able to search, navigate and annotate.

It is smart to let people edit, refine and comment (annotate) in the same manner as Wikipedia evolves, i.e. editing Wikidata entries. These annotations then feed back to the governance network of the terminologies.

Term Update

Link to organising systems

All models (taxonomies, vocabularies, ontologies etc.) should be interlinked with the existing base of organising systems (information systems [IS]) or platforms. Most ISs have schemas, in-built models and business rules that serve an application for a specific use-case. This implies the use of concepts to define and describe data as metadata, as reference data tables or as user experience controls. In all these lego pieces within an IS or platform, there are opportunities to link concepts to the shared narratives in the terminology service: linked enterprise data building a web of meaning, and opening up for a more interoperable information landscape.

One omnipresent quest is to set up a sound content model and design for, e.g., Office 365, where content types, collections, resource descriptions and metadata have to be concerted in the back-end services, such as the managed metadata service. Within these features and capacities, it is wise to integrate with the semantic layer (terminologies and graphs). Other highly relevant integrations relate to search-as-a-service, where the semantic layer co-acts in the pipeline steps: adding semantics, linking, auto-classifying and disambiguating with entity extraction. In the user experience journey, the semantic layer augments and connects things, which is for instance how Microsoft Graph has been ingrained throughout their platform. Search and semantics push the envelope 😉
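The auto-classification step of such a pipeline can be sketched very simply: tag a document with the taxonomy concepts whose labels occur in its text. The concept IDs and label table below are invented for the example; a real pipeline would pull them from the terminology service and add disambiguation.

```python
# Illustrative sketch of auto-classification in a search pipeline:
# tag a document with taxonomy concepts whose labels appear in the text.
# The concept IDs and labels are invented for this example.
import re

# concept id -> preferred and alternative labels (from the terminology service)
labels = {
    "ex:car": ["car", "automobile"],
    "ex:electric-car": ["electric car", "EV"],
}

def extract_concepts(text):
    """Return the taxonomy concepts mentioned in the text (entity extraction)."""
    found = set()
    for concept, terms in labels.items():
        for term in terms:
            # whole-word, case-insensitive match for each known label
            if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE):
                found.add(concept)
    return sorted(found)

doc = "The new electric car outsells every other automobile in the fleet."
print(extract_concepts(doc))
```

In practice this label-matching step is where the alternative labels curated in the taxonomy pay off: synonyms in the content still resolve to the same shared concept.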

Data integration and information mechanics

A decoupled information systems architecture using an enterprise service bus (messaging techniques) is by far the most used model. To enable sustainable data integration, there is a need for a data architecture and a clear integration design. Adjacent to data integration are the means for cleaning up data and harmonising data-sets into a cohesive whole: extract-transform-load [ETL]. Data governance is essential! In this ballpark we also find cues to master data management. Data and information have fluid properties, and the flow has to be seamless and smooth.

When defining the (asynchronous) message structure in information exchange protocols and packages, it is highly desirable to rely on standards and well-defined models (ontologies), as within the healthcare and life science domain using HL7 FHIR. These standards have domain models with entities, properties, relations and graphs. The data serialisation for data exchange might use XML or RDF (JSON-LD, Turtle etc.). The value-sets (namespaces) for properties can then be linked to SKOS vocabularies with terms.
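As a hedged sketch of this idea, the JSON-LD message below links one property value to a terminology concept instead of carrying a free-text string. The URIs and field names are invented examples, not an actual HL7 FHIR payload.

```python
# Sketch of a minimal JSON-LD message for data exchange, where a property
# value points into a SKOS terminology. URIs are invented, not real FHIR.
import json

message = {
    "@context": {
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "name": "http://schema.org/name",
        "condition": {"@id": "http://example.org/vocab/condition",
                      "@type": "@id"},
    },
    "@id": "http://example.org/patient/123",
    "name": "Jane Doe",
    # the value is a link into a terminology, not a free-text string
    "condition": "http://example.org/terms/hypertension",
}

payload = json.dumps(message, indent=2)   # serialised for the message bus
print(payload)
```

Because `condition` is typed as `@id`, any consumer on the bus can dereference or look up the concept in the shared vocabulary rather than guessing at a string value.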

Query the graph

Knowledge engineering means both putting useful terminologies into action and loading, refining and developing ontologies (information models, data models). There are many very useful open ontologies that could or should be used and refined by taxonomists, e.g. the ISA² Core Vocabularies. With data-sets stored in a graph (triplestore), there are many ways to query the graph for results and insights (links): either by using SPARQL (similar to SQL in schema-based systems), combining it with SHACL (constraints), or via RESTful APIs.

These means of querying the knowledge graph are one reason to add semantics to the data integration described above.

Adding smartness and we are all done…

Semantic AI, the means to bridge between symbolic representation (semantics) and machine learning (ML), natural language processing (NLP) and deep learning, is where all things come together.

The work of knowledge engineering, building and governing the knowledge graph, taxes many manual steps, such as mapping models, standards and large corpora of terminologies. Here AI capacities enable automation and continuous improvement with learning networks. Understanding human capacities and intelligence, unpacking the neurosciences (as Lars Leksell did) combined with neural networks, will be our road ahead towards safe and sustainable uses of AI.
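One of those manual steps, suggesting mappings between two terminologies, can be partly automated even with a crude baseline such as string similarity (stdlib `difflib`). The term lists below are invented; a real pipeline would add embeddings and keep a human reviewer in the loop.

```python
# Hedged sketch: suggest candidate mappings between two terminologies
# by surface similarity. Term lists are invented for the example.
from difflib import SequenceMatcher

source = ["myocardial infarction", "hypertension", "type 2 diabetes"]
target = ["heart attack (myocardial infarction)", "high blood pressure",
          "diabetes mellitus type 2", "hypertensive disorder"]

def suggest(term, candidates, cutoff=0.4):
    """Rank candidate target terms by surface similarity to `term`."""
    scored = [(SequenceMatcher(None, term, c).ratio(), c) for c in candidates]
    return [c for score, c in sorted(scored, reverse=True) if score >= cutoff]

for term in source:
    # top suggestion per source term, to be confirmed by a human curator
    print(term, "->", suggest(term, target)[:1])
```

The point is not the similarity measure itself but the division of labour: the machine proposes, the terminology governance network disposes.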
Fredric Landqvist research blog