Organisations are facing new types of information challenges in the AI-hype. At least the use cases, the data and the technology are different. The recommended approach and the currently experienced findability challenges remains however the same.
Findability is getting worse as the data landscape is changing
As clearly shown in the result of the 2019 Search & Findability Survey, finding relevant information is still a major challenge to most organisations. In the internal context as many as 55% find it difficult or very difficult to find information which brings us back to same levels that we saw in the survey results from 2012 and 2013.
Given the main obstacles that the respondents experience to improve search and findability, this is not very surprising:
- Lack of resources/staff
- Lack of ownership/mandate
- Poor information quality
One reason behind the poor information quality might be the decreasing focus and efforts spent on traditional information management activities such as content life cycle, controlled vocabularies and metadata standards as illustrated in the below diagrams*. In 2015-16 we saw an increase in these activities which made perfect sense since “lack of tags” or “inconsistent tagging” was considered the largest obstacles for findability in 2013-2015. Unfortunately, the lack of attention to these areas don’t seem to indicate that the information quality has improved, rather the opposite.
(*percent working with the noted areas)
A likely reason behind the experienced obstacles and the lack of resources to improve search and findability is a shift of focus in data and metadata management efforts following the rapid restructuring of the data landscape. In the era of digital transformation, attention is rather on the challenge to identify, collect and store the massive amounts of data that is being generated from all sorts of systems and sensors, both within and outside the enterprise. As a result, it is no longer only unstructured information and documents that are hard to find but all sorts of data that are being aggregated in data lakes and similar data storage solutions.
Does this mean that search and findability of unstructured information is no longer relevant? No, but in combination with finding individual documents, the target groups in focus (typically Data Scientists) have an interest in finding relevant and related data(sets) from various parts of the organisation in order to perform their analysis.
Digital (or data-driven) transformation is often focused on utilising data in combination with new technology to reach level 3 and 4 in the below “pyramid of data-driven transformation” (from In search for insight):
This fact is also illustrated by the technology trends that we can see from the survey results and that is presented in the article “What are organisations planning to focus on to improve Search and Findability?”. Two of the most emerging technologies are Natural Language Processing (NLP) and Machine Learning which are both key components in what is often labelled as “AI”. To use AI to drive transformation has become the ultimate goal for many organisations.
However, as the pyramid clearly shows, to realise digital transformation, automation and AI, you must start by sorting out the mess. If not, the mess will grow by the minute, quickly turning the data lake into a swamp. One of the biggest challenges for organisations in realising digital transformation initiatives still lies in how to access and use the right data.
New data and use cases – same approach and challenges
The survey results indicate that, irrespective of what type of data that you want to make useful, you need to take a holistic approach to succeed. In other words, if you want to get passed the POC-phase and achieve true digital transformation you must consider all perspectives:
- Business – Identify the business challenge and form a common vision of the solution
- User – Get to know your users and what it takes to form a successful solution
- Information – Identify relevant data and make it meaningful and F.A.I.R.*
- Technology – Evaluate and select the technology that is best fit for purpose
- Organisation – Establish roles and responsibilities to manage and improve the solution over time
You might recognise the five findability dimensions that was originally introduced back in 2010 and that are more relevant than ever in the new data landscape. The survey results and the experienced obstacles indicate that the main challenges will remain and even increase within the dimensions of information and organisation.
Also, it is important to remember that to create value from information it is not always necessary to aim for the top of the pyramid. In many cases it is enough to extract knowledge and thereby provide better insights and decision support by aggregating relevant data from different sources. Given that the data quality is good enough that is.
*The strategy to get a sustainable data management, implies leaning upon the FAIR Data Principles
- Make data Findable, through persistent ID, rich metadata, indexes and combine id+index.
- Make data Accessible, through standard communication protocols, open and free protocols and authentication mechanism where necessary and always keep metadata available.
- Make data Interoperable, through the use of vocabularies, terminologies, glossaries, use open vocabularies/models and link the metadata.
- Finally make data Reusable, by using multiple metadata attributes, set constraints based on licenses, and express provenance to build trusted and quality datasets leaning upon community standards.
Author: Mattias Ellison, Findability Business Consultant