Semantic Annotation (how to make stuff findable, and more)

With semantic annotation, your customers and employees can get the right information in order to make better decisions

Why automatic Semantic Annotation?  

Empower customers & employees with the right information 

Moving data and services to the Cloud has many advantages, including more flexible work practices. COVID-19 has boosted this trend and many organisations are benefiting from employees also being able to work from home. If employees are to be treated as customers themselves, they should expect a quality Search service. Semantic Annotation can help with this.

For many employees, finding information is still a problem. Poor Search does little to encourage users to use it, or to improve their decision-making, knowledge sharing, curiosity or innovation. Let's not forget: better search also means less duplication. 

Making data and content “smarter” makes it more findable. 

 

Data and content are rarely structured with good metadata or tagging (annotation) unless they are being used to sell something, or they are deemed business critical. Generally, when we create data or content, we just save it to storage. 

We could tag manually, but research shows that we’re not good at this. Even if we bother to tag, we only do it from our own perspective, and even then, we do it inconsistently over time.  

Alternatively, we could let AI do the work. Give data/content structure, meaning and context (all automatically and consistently), so that it can be found. 

The main driver for automatic Semantic Annotation? About 70-80% of the average organisation's data is unstructured (i.e. textual). Add to this that even databases have textual labels and headings. 

How to create automatic Semantic Annotation?  

Use stored knowledge (from an Enterprise Knowledge Graph) 

When thinking about the long-term data health of an organisation, the most effective and sustainable way to set up semantic annotation is to create your own Enterprise Knowledge Graph (which can then be used for multiple use-case scenarios, not just annotation). 

In an Enterprise Knowledge Graph (EKG), an organisation can store its key knowledge (taxonomies, thesauri, ontologies, business rules). Tooling now exists so that business owners and domain experts can collaboratively add their knowledge without having to know about the underlying semantic web technologies – the ones that allow your machines and applications to read this knowledge as well (before making their decisions). 

Your EKG is best created using both human input and AI (NLP & ML – Natural Language Processing and Machine Learning). The AI part exploits your existing data plus any existing industry-standard terminologies or ontologies that fit your business needs (you may simply want to link to them). While the automation of EKG creation is set to improve, EKG robustness can be tested by running corpus analysis over your data to find any key business concepts that are missing.
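To make the idea concrete, here is a minimal sketch of what a tiny fragment of an EKG could look like, using the open-source Python library rdflib and the SKOS vocabulary; the namespace, concepts and labels are invented purely for illustration.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

# Hypothetical namespace for an organisation's own concepts (illustration only)
EX = Namespace("https://example.org/ekg/")

g = Graph()
g.bind("skos", SKOS)

bridge = EX["concrete-bridge"]
concrete = EX["concrete"]
material = EX["construction-material"]

# Each business concept gets a persistent URI, a preferred label and
# alternative labels (synonyms, acronyms, spelling variants)
g.add((bridge, RDF.type, SKOS.Concept))
g.add((bridge, SKOS.prefLabel, Literal("concrete bridge", lang="en")))
g.add((bridge, SKOS.altLabel, Literal("betongbro", lang="sv")))

g.add((concrete, RDF.type, SKOS.Concept))
g.add((concrete, SKOS.prefLabel, Literal("concrete", lang="en")))

g.add((material, RDF.type, SKOS.Concept))
g.add((material, SKOS.prefLabel, Literal("construction material", lang="en")))

# The relationships between concepts are what later drive annotation and search
g.add((concrete, SKOS.broader, material))
g.add((bridge, SKOS.related, concrete))

print(g.serialize(format="turtle"))
```

Because the graph is plain RDF, the same concepts can later be queried with SPARQL or moved into whichever EKG tooling you settle on.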

How does automatic Semantic Annotation work?  

Smart processing 

Despite improvements in search features and functionality, Search in the digital workplace may still have that long tail of search – where the less frequent queries are harder to cater for. With an EKG annotation process, the quality of search results can improve significantly. Processing takes concepts extracted (via Named Entity Recognition) from the resource asset that needs to be annotated. It then finds all the relationships that link these concepts to other concepts within the graph. In doing so, the aboutness of the asset is calculated using an algorithm before appropriate annotation takes place. The annotations go into building an improved index. The process essentially makes your data assets "smarter," and therefore more findable.  
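The exact aboutness algorithm differs between products and is not reproduced here, but the following simplified sketch illustrates the principle: match EKG concept labels against the text (standing in for full Named Entity Recognition), then boost concepts whose related concepts also occur. The data, weights and scoring are illustrative assumptions only.

```python
from collections import Counter

# Toy EKG: concept -> related concepts (in practice, edges in the graph)
RELATED = {
    "concrete bridge": {"concrete", "bridge design"},
    "concrete": {"concrete bridge", "construction material"},
    "bridge design": {"concrete bridge"},
}

def annotate(text, concepts=RELATED):
    """Return (concept, aboutness score) pairs for one resource asset."""
    text = text.lower()
    # 1. Stand-in for NER: count concept label occurrences in the text
    found = Counter({c: text.count(c) for c in concepts if c in text})
    scores = {}
    for concept, freq in found.items():
        # 2. Graph boost: reward concepts whose related concepts also occur
        related_hits = sum(1 for r in concepts[concept] if r in found)
        scores[concept] = freq + 0.5 * related_hits  # illustrative weighting
    # 3. The highest scoring concepts become annotations in the search index
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

doc = "The project built a concrete bridge; the concrete was cast on site."
print(annotate(doc))  # [('concrete', 2.5), ('concrete bridge', 1.5)]
```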

Processing also includes shadow concept annotations – the adding of a concept tag where the concept itself does not appear within the resource asset, but perfectly describes the resource (thanks to known concept relationships in the graph). Similarly, the quality of retrieved search results can be increased as the annotation process reduces the ambiguity around the meaning of certain concepts, e.g. it differentiates between Apple (the brand) and apple (the fruit) by virtue of their connections to other concepts, i.e. it can answer: are we talking tech or snacks? 
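A shadow concept can be derived from those same graph relationships. As a hedged continuation of the sketch above: if enough of the extracted concepts point to a common related concept that never appears in the text itself, that concept is still attached as an annotation. The supporting-vote threshold is an arbitrary choice for illustration.

```python
from collections import Counter

def shadow_concepts(annotations, related, min_support=2):
    """Suggest concepts not present in the text but implied by it.

    A candidate is added when at least `min_support` of the extracted
    concepts point to it in the graph (threshold chosen for illustration).
    """
    votes = Counter()
    for concept in annotations:
        for neighbour in related.get(concept, set()):
            if neighbour not in annotations:  # not already found in the text
                votes[neighbour] += 1
    return {c for c, n in votes.items() if n >= min_support}

# "bridge design" never occurs in the document, but both extracted concepts
# point to it in the graph, so it becomes a shadow annotation.
related = {
    "concrete bridge": {"bridge design", "concrete"},
    "concrete": {"bridge design", "construction material"},
}
print(shadow_concepts(["concrete bridge", "concrete"], related))
# {'bridge design'}
```

The same neighbour counting is one way to picture the Apple-versus-apple disambiguation: the candidate sense whose graph neighbours actually occur in the text wins.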

Your preferred tooling may be that which supports the part-human, expert maintenance of key business language (taxonomies – including phrases, alternative labels, acronyms, synonyms etc.). Thus, the EKG can reflect the differing language and culture perspectives of both customers and employees (think Diversity & Inclusion). And of course, search just gets better when linked to user profile concepts for personalisation. 

Analysis of search queries to find "new" language means that business language can be kept "alive," reflecting both your data and query trends (typed and spoken). The resulting APIs can offer many different UX options, e.g. for "misfired" queries: clickable, search-generating related concepts, or broader/narrower concepts for decreased/increased search granularity.

What are the alternatives? 

EKGs, AI enhancements and COTS 

There are several providers of commercial knowledge engineering and graph software on the market, many of whom Findwise partners with. As EKGs are RDF-based, once made, they are transferable between software products, should the need arise. 

Incremental AI-based algorithmic additions can be made to improve existing search (e.g. classifiers, vector embeddings etc.), though these tend to have more of a single-focus, single-system perspective. Very often these same enhancement techniques can also provide input for improving and automating EKGs – just as the EKG can offer a logical base and rules for a robust AI engineering strategy. 

EKGs offer a hybrid architecture with open-source search engines. There are of course commercial off-the-shelf solutions (COTS) that offer improved search over data assets (often also with a graph behind them). But before you accept any vendor lock-in, check what it is you need and whether they cover all or any of the possible EKG-related scenarios: 

  • Are they inclusive of all your data?
  • Do they help formalise a data governance and accountability framework?
  • Is the AI transparent enough to understand?
  • Can your information and business model(s) be built in and reflected in the data structures? How easy would it be to alter your business model(s) and see such changes reflected in the data structures?
  • Does the software solution cope with other use cases, e.g. data findability and FAIR data?
  • Do they have multilingual functionality?
  • Can they help make your data interoperable or connected with your ecosystem or Web data?
  • Do they support potential data-centric solutions or just application-centric ones?

 

Semantic Annotation: How to make it happen? 

Your ultimate choice may come down to the degree to which you want or need to control your data and data assets, plus how important it is for your organisation to monitor their usage by customers and employees. 

EKGs are mostly introduced into an organisation via a single use case rather than as the result of a future-looking, holistic, data-centric strategy – though the latter is not unheard of. That said, introducing automatic Semantic Annotation with an EKG could prove a great follow-up to your organisation's Cloud project, as together they can dramatically increase the value of your data assets within the first processing. 

For an example of an implemented semantic annotation use case, see the NHS Learning Hub, a collaborative Health Education England and Findwise project. 

Alternatively check out Findability by Findwise and reach out to get the very best digital transformation roadmap for your organisation.

Peter Voisey

Design Elements of Search – Zero Results Page

The sixth and last part in this series, Design Elements of Search is dedicated to the zero results page. This lonely place is where your users end up when the search solution doesn’t find anything. Do your best to be friendly and helpful to your users here, will you?

A blog series – Six posts about Design Elements of Search


A word on Technology and Relevance – a disclaimer

Just as important as having a good user interface is having the right technology and the right relevance model set-up. I will not cover technology and relevance in this blog series. If you wish to read more, these topics are already well covered by Findwise: Improve search relevancy and Findwise.com/technology.


Designing Zero Results Page

The design, function and layout of your zero results page say a lot about the quality of your search solution. This page is often forgotten and discussed last (as in this series). Whenever I review existing search solutions, this is where I start, because a lot of problems with existing search solutions show up here. You need to understand that, from the user's perspective, ending up on a zero results page can be a frustrating experience. You need to help the user recover from this state. Below is a good example from one of our clients: the intranet of the Swedish courts. The page clearly explains what has happened: "No documents were found".


A good zero results page that clearly explains “No documents were found”.

Providing further Help

Sometimes there is nothing the system can do to deliver results. The last resort is to ask your user to alter their query – sometimes the query is misspelled or otherwise not optimal. You can copy and use this text on your own zero results page if you like:

  • Check that all words are spelled correctly
  • Try a different search
  • Try a more general search
  • Use fewer search terms

Avoid digging a deeper hole

Microsoft's OneDrive provides a beautiful zero results page below, but they make a big mistake by showing filtering options in this state. This makes no sense: if there already are no results, narrowing down the search scope further will definitely not produce more. Avoid this mistake!


Pretty looking, but bad zero results page because of the filters on the right hand side.

That was it! The whole Design Elements of Search series is done. This is not everything, however – designing a search solution goes deeper than this. My friends at Findwise and I will gladly help you realize all of your dreams. Ok, maybe not all of them, but your search-related dreams maybe? Ok, that was awkward.

See you in the future, best regards //Emil Mauritzson

Get in touch

Contact Findwise

Contact Emil Mauritzson

Design Elements of Search – Landing Page

We covered the area of results in the previous post. I hope that was fun – you are still here! That means you are ready for more, awesome! Let's get into it. Here is the fifth part in the series Design Elements of Search: landing pages. Whatever can they be?

A blog series – Six posts about Design Elements of Search


A word on Technology and Relevance – a disclaimer

Just as important as having a good user interface is having the right technology and the right relevance model set-up. I will not cover technology and relevance in this blog series. If you wish to read more, these topics are already well covered by Findwise: Improve search relevancy and Findwise.com/technology.


Designing Landing Pages

What normally happens when you click a search result? The answer seems obvious, you are sent to that document or that webpage or that product. Easy peasy.


Traditionally you leave the search solution when clicking results.

However, during my years of consulting, I have come across multiple cases where we don't know where to send users, because there is no obvious destination. Consider a result for an employee, a product, a process or a project. Sometimes there is no existing holistic view for these information objects. In these cases, we suggest building that holistic view in something we at Findwise call landing pages. When we use landing pages for certain results, users remain inside the search application when they click a result like this – unlike a traditional search interface that sends users away to another application or document.


Get to the landing pages from the ordinary results page.

Paving the path

On landing pages, we show relationships between a variety of information objects we have in the search index. Let me describe it this way.

Sarah works as an architect. In her daily work she needs to be up to date on certain types of projects within her area of expertise. Therefore, Sarah is now doing research on how a certain material was used in a certain type of construction. She searches for "concrete bridges" and sees that there are 12 project results. Sarah looks over the results, clicks the third project and sees the landing page for that project. Here, she can see high-level information about the project, and also see who the project members have been. Sarah sees Arianna Fowler and several more people. She is curious about Peter Fisher because that name sounds familiar. She now sees the landing page for Peter. Here she can see all the projects Peter has been working on. She sees Peter's most recent documents. She sees his close colleagues. Sarah sees that Peter has been working in multiple projects that have used concrete as the main material. However, when she calls Peter, she learns he is not available right now. Therefore, Sarah decides to call Peter's closest colleague. The system has identified close colleagues by knowing how many projects people have been working on together. Sarah calls Donna Spencer instead, because Donna and Peter have collaborated in 12 projects in the last five years. Sarah gets to know everything she needed and is left in a good mood.
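That "close colleagues" signal can be computed directly from the indexed project memberships. A minimal sketch, with invented names and data:

```python
from collections import Counter

# Hypothetical project index: project id -> members (names are invented)
projects = {
    "P1": {"Peter Fisher", "Donna Spencer", "Arianna Fowler"},
    "P2": {"Peter Fisher", "Donna Spencer"},
    "P3": {"Peter Fisher", "Donna Spencer", "Emil Berg"},
}

def closest_colleagues(person, projects):
    """Rank colleagues by how many projects they share with `person`."""
    shared = Counter()
    for members in projects.values():
        if person in members:
            for other in members - {person}:
                shared[other] += 1
    return shared.most_common()

print(closest_colleagues("Peter Fisher", projects))
# [('Donna Spencer', 3), ('Arianna Fowler', 1), ('Emil Berg', 1)]
```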

Interesting paths

Your specific use case determines what information makes sense to show in these landing pages. Whatever you choose, you will set your users up for interesting paths of information finding and browsing, by connecting at least two information objects with landing pages. See illustration below.


Infinite discovery made possible by linking landing pages together.

When you look past the old way of linking users directly to documents and systems, and instead make it possible to find unexpected connections between things, you have widened the definition of what enterprise search can be. This is a new way of delivering value to your organization using search.

This marks the end of the fifth part, next up you’ll read about what happens when a search yields zero results, and what you should do about that.

Get in touch

Contact Findwise

Contact Emil Mauritzson

Design Elements of Search – Results

You are currently reading the fourth part in the series Design Elements of Search. This part is about the search results. The actual results are certainly the most central part of an entire search solution, so it's important to get this part right. Don't worry, I'll show you how.

A blog series – Six posts about Design Elements of Search


A word on Technology and Relevance – a disclaimer

Just as important as having a good user interface is having the right technology and the right relevance model set-up. I will not cover technology and relevance in this blog series. If you wish to read more, these topics are already well covered by Findwise: Improve search relevancy and Findwise.com/technology.


Designing Results

Let's say you are satisfied with the relevance model for now – how on earth do you design good-looking and good-performing results? If your indexed information is mostly text documents, your results will likely have a title and a snippet. That's good, but it's all the other things you include in the result that make it great. For each content source you have, you'll need to think about what your target audience wants to see. You'll want your users to be able to understand whether this seems like the right result or not.

Snippet

A snippet is the chunk of text presented on a search result, usually below the title. If you have a 1,000-word PDF and the user searches for a word in the document, the search engine will show some words before the search term, and some words after. These snippets usually start with three dots (…) to indicate that the text is cut off. Snippets help your user understand what the document is about. If it seems interesting, the user can decide to click on the result.


A regular search result from www.startpage.com.
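For illustration, a naive snippet generator could look like the sketch below; the window size, punctuation handling and ellipsis style are arbitrary choices, and a real search engine will normally produce snippets for you.

```python
def make_snippet(text, term, window=8):
    """Return a few words around the first match of `term`, with ellipses."""
    words = text.split()
    lowered = [w.lower().strip(".,;:!?") for w in words]
    if term.lower() not in lowered:
        return " ".join(words[: 2 * window])  # fallback: start of the text
    hit = lowered.index(term.lower())
    start, end = max(0, hit - window), hit + window + 1
    prefix = "… " if start > 0 else ""
    suffix = " …" if end < len(words) else ""
    return prefix + " ".join(words[start:end]) + suffix

print(make_snippet("The annual report describes revenue growth in detail "
                   "for every region and business unit.", "revenue", window=3))
# … annual report describes revenue growth in detail …
```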

Context

If you have indexed documents from a file share, provide the folder structure as breadcrumbs. Bonus points for making the individual folders clickable. If you have indexed webpages, show the URL as breadcrumbs and make the individual pages clickable. Not all subpages make sense to navigate to, depending on your structure – bonus points to you if you exclude these from being links. Below you see a webpage located in "University -> Home -> Departments -> Mathematical Sciences -> Research". This context is valuable information that helps your user understand what to expect of this search result.


The url is used to communicate context, answering the question “where is this page located on the site”.

What Type is this Result?

When you index data sets from different sources and make them findable in a common search interface, you need to be as clear as possible about helping your user understand: "What is this result?". Show clearly with a label whether the result is a guide, a blog post, a steering document, a product, a person, a case study, and so on. You want descriptive labels, not general ones like document, webpage or file – these general labels seldom make sense to users. Again, your labels and how you enable slicing and dicing of the data are the result of the IA work done, and not directly covered in this series.

Filetype

I just said above that the label "Document" doesn't make much sense. That's not the same thing as showing what filetype the current document has. It is sometimes helpful to know whether a file is a PDF or a Word file. Like Google and other search engines, show the filetype to the left of the title, in a little box. If your company uses Microsoft Office, you can have labels like Word, Excel and PowerPoint. If you design for a general audience it makes more sense to use labels like DOC, XLS and PPT.

This is a good place to use colors: most word processors' icons are blue, like Microsoft Word and Google Docs; Excel and Google Sheets are green; Adobe Reader is red. Regarding variations of filetypes, help your users by not bothering them with the difference between XLS and XLSX, or DOC and DOCX and so on – just call them XLS and DOC. Since filetype often doubles as a filter, excluding the different variants of the same file format will also reduce the number of options in the list. Below we use colors, icons and labels to communicate filetype.


The filetype is clearly visible and communicated through text, icon and color.
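Collapsing the filetype variants into one label and color can be as simple as a lookup table; the mapping and colors below are only an example.

```python
# Map raw file extensions to one display label and one icon color,
# so that "xls" and "xlsx" end up as the same value (mapping is an example).
FILETYPE_STYLES = {
    "doc": ("DOC", "blue"), "docx": ("DOC", "blue"),
    "xls": ("XLS", "green"), "xlsx": ("XLS", "green"),
    "ppt": ("PPT", "orange"), "pptx": ("PPT", "orange"),
    "pdf": ("PDF", "red"),
}

def filetype_badge(filename):
    extension = filename.rsplit(".", 1)[-1].lower()
    return FILETYPE_STYLES.get(extension, (extension.upper(), "gray"))

print(filetype_badge("Budget 2021.xlsx"))  # ('XLS', 'green')
```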

Highlighting

Showing your users how results match the query is a key component of a well-liked and well-understood search solution. In practice, highlighting means that if the user searches for "summer vacation", you provide a different styling on the words "summer" and "vacation" in the result. Most of the time, snippets come standard with highlighting, either in bold or in italics. In order to provide meaningful results, show highlighting everywhere on the result: if the matching terms are in the title, highlight that; if they are in the breadcrumb, highlight that. Also, you can get creative and highlight in other ways than bold or italics, just see below.


Search result with “summer” highlighted.

Here we try to mimic the look and feel of an actual highlighting pen, pretty neat.


Highlighting up-close.
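If your search engine does not already return highlighted fields, a minimal server-side sketch of the principle could look like this; the <mark> tag is only one styling option and the regex approach is deliberately simple.

```python
import html
import re

def highlight(text, query, tag="mark"):
    """Wrap every query term found in the text with an HTML tag, case-insensitively."""
    marked = html.escape(text)
    for term in query.split():
        pattern = re.compile(rf"\b({re.escape(term)})\b", re.IGNORECASE)
        marked = pattern.sub(rf"<{tag}>\1</{tag}>", marked)
    return marked

print(highlight("Summer vacation policy for summer interns", "summer vacation"))
# <mark>Summer</mark> <mark>vacation</mark> policy for <mark>summer</mark> interns
```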

Time

Whether you are searching a website, an intranet or something else for that matter, always show the date of publication, or the date of revision if you have it. Otherwise, how would you know if the document "Release tables March 29" is recent, or very old? Many people get this basic thing wrong – don't be one of them!

Be bold, but be Right

In order for your users to understand what data you are showing on the result, the data needs a label describing it, like "Author: Emil Mauritzson". All good so far. The most important thing is the data (Emil Mauritzson), not the label (Author). I see many getting this wrong and highlighting the label. Highlight the data instead.


Make the most important thing most visible.

So, there’s that. The part about results is complete. If you are ready for more, get on to the next part, the one about what we call landing pages, whatever that can be…Exciting!

Get in touch

Contact Findwise

Contact Emil Mauritzson

Design Elements of Search – Filters

Hey, I’m happy you have found your way here, you are currently reading the third part in the series Design Elements of Search. This part is dedicated to filters, tabs and something we like to call filter tags.

A blog series – Six posts about Design Elements of Search


A word on Technology and Relevance – a disclaimer

Just as important as having a good user interface is having the right technology and the right relevance model set-up. I will not cover technology and relevance in this blog series. If you wish to read more, these topics are already well covered by Findwise: Improve search relevancy and Findwise.com/technology.


Designing Filters

When setting up new search solutions, we tend to spend a lot of time with the data structure. How should our users slice and dice the search-results? What makes sense? What does not? This is the part of the job sometimes classified as Information Architecture (IA). This text focuses more on the visual elements, the results of the IA work you can say.

Don’t make it difficult

The biggest pitfall when designing search is to overwhelm the user with too many options.

You got a million hits! – There are 345566 pages – Here are some results, Do you only want to see People results? – Sort by Price, Ascending or Descending?! – Click me – Did you mean: Coffee buns? – Click me – CLICK MEEEE! Yep, try to tone this down if you can.

Below you'll see a disastrous layout. There are so many things screaming for users' attention. If you look really hard, you can see a search result all the way down at the bottom of the picture.


The original interface, very little room for results.

I said above that we spend a lot of time on the structure (IA). And we generally spend a lot of time on filters as well. This time is well spent. However, we need to realize what is most important for our users: do they find what they are looking for, or not? The order of the search results, i.e. the relevance, is most important. Therefore, the actual search results should be totally in focus, visually, in your interface.

Make it Easy

Instead of giving your users too many options up-front, consider hiding filters under a button or link. The button can say “Filter search results”, or “Refine results” or “Filter and Sort”. I’ll show you what I mean below. I have removed and renamed things from the above example, creating a design mockup. It’s not a perfect redesign, but you get my point, hopefully. All of a sudden there is room for three results on screen, success!


A cleaned up interface, more room for results.

The second example is a sneak peek of White Arkitekter internal search solution. Here we can follow the user searching from the start page and applying a filter. The search results are in focus, and at the same time it’s easy to apply filters when needed. A good example.


Showing how easy a filter is applied.

Search inside Filters

In the best case, a specific filter will contain a handful of values that are easily scanned just by looking at the list. In reality, however, these lists of filter values are often long. How should you sort the list? Often, we sort them by "most first", sometimes alphabetically. When the list is not easily scannable, provide a way to "search" inside the filter. Like this:


Typing inside this filter is helping the user more quickly find “Sweden”.

Filters values with Zero Results

Hey, if a filter value will yield zero results, like Calendar, Local files and Archived files below, show the filter value but don't make it clickable! Why on earth would you want that? You don't want to send your users to a dead end. Sometimes they will end up there anyway, and then you have to help. Skip ahead to the part about the Zero Results Page to learn how to help users recover.


A filter with some values returning zero results. Good to show them, but important to make them not clickable.

Filter tags

I said above that the results should be the graphical element that stands out the most, and also that the first refinement should be easy to make. Well, this will mean that the filters are hidden behind something. This does not mean, by the way, that the filter selection made by the user should be hidden. On the contrary: you definitely want to be clear about what things affect the search results. This is normally the query, the filter selections and the sorting. A filter tag is simply a graphical element that is clearly visible above the search results when activated. It is also easy to remove, simply by clicking on it. Below, I show you an example where the user has filtered on "News".


“News” is the active filter. A green filter tag is visible and is easy to see and easy to remove.

If you are up to a third example of filters check this case study out about Personalized search results in Netflix-style user interface.

This was all I had for you regarding filters. I hope some of it made sense; if not, let's get in touch and you can ask me about more details. Or perhaps tell me something I have missed. Always be learning! The next post will discuss results, see you over there.

Further reading

Information Architecture Basics

Filters vs. Facets: Definitions

Mobile Faceted Search with a Tray: New and Improved Design Pattern

Get in touch

Contact Findwise

Contact Emil Mauritzson

Design Elements of Search – Autocomplete Suggestions

You are currently reading the second part in the series Design Elements of Search, the one about autocomplete suggestions. When you’re typing text into the search bar, something is happening just below. A list of words relevant to the text appears. You probably know this from Google and around the web. I will share my findings and some best practices for autocomplete suggestions now. Call me a search-nerd, because I really enjoy implementing awesome autocomplete features!

A blog series – Six posts about Design Elements of Search


A word on Technology and Relevance – a disclaimer

Just as important as having a good user interface is having the right technology and the right relevance model set-up. I will not cover technology and relevance in this blog series. If you wish to read more, these topics are already well covered by Findwise: Improve search relevancy and Findwise.com/technology.


Designing Autocomplete Suggestions

I bet you recognize this? It just works right. But how do you get here? Read on and I will tell you.


How autocomplete works at google, a solid experience.

Instant Search

Autocomplete suggestions are a nice feature to offer when you expect your users to execute the query by clicking the search icon or pressing the Enter key. However, sometimes your search solution is set up in such a way that for each character the user enters, a new search is performed automatically – this is called instant search. When this is the case you do not want autocomplete suggestions. Google experimented with instant search a few years ago and decided to revert back for a few reasons. However, providing instant search in your use case might still be a good idea. In my experience, instant search works well for structured data sets, like a product catalogue or similar. When your information is diversified – the results could be documents, web pages, images, people, videos and so on – you are probably better off providing traditional search in combination with autocomplete suggestions.

Suggestions based on User Queries

In my experience, using queries as the foundation for suggestions is the way to go. You can't just take all queries and potentially suggest them to your entire user base, though. What happens if you have a bad actor who wants to troll and mess up your suggestions? Let's say a popular query among your users is "money transfer" and your bad actor searches for something as nasty as "monkeyballs" 100 times. How do you make sure to provide the right suggestion when your user types "mon" in the search bar? You definitely don't want your search team to actively monitor your potential autocomplete suggestions and manually weed out the bad ones.

One effective method we use is to check whether the query matches any document in the index. Hopefully (!?) you do not have any document containing the word "monkeyballs" in your index, and therefore this term will not be suggested to your users in the autocomplete suggestions. Using this method will make sure your suggestions are always domain-specific to your particular case.

Another safeguard to ensure high-quality suggestions is to have a threshold. A threshold means a query needs to be performed X times before it ends up as a potential suggested term. You can experiment with this threshold in your specific case for the best effect. The threshold will weed out "strange" queries, like seemingly random numbers and other queries entered by mistake, that happen to yield some results.
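Both safeguards fit in a few lines. In the hedged sketch below, index_has_results is a placeholder for whatever call your search engine exposes for checking that a query matches at least one document, and the threshold value is arbitrary.

```python
from collections import Counter

def build_suggestions(query_log, index_has_results, threshold=5):
    """Keep only queries that are frequent enough AND match the index."""
    counts = Counter(q.strip().lower() for q in query_log if q.strip())
    return {
        query: count
        for query, count in counts.items()
        if count >= threshold and index_has_results(query)
    }

# Example with a fake index lookup; "monkeyballs" is filtered out even though
# the bad actor searched for it 100 times, and "mon" falls under the threshold.
fake_index = {"money transfer", "money market"}
log = ["money transfer"] * 40 + ["monkeyballs"] * 100 + ["mon"] * 2
print(build_suggestions(log, lambda q: q in fake_index))
# {'money transfer': 40}
```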

Here is a high-level architecture of a successfully implemented autocomplete suggester at a large client.


Architectural overview of a good performing autocomplete suggester implemented at a client.

Right information, at the right time

So far, I have explained how to weed out the poor and nasty terms. More importantly, however, how do you suggest terms in a good order? Basically, we consider that the more people search for something, the higher up the term will be in the list of suggestions. But how do you solve the following case? Let's say summer is coming up, and people are interested in "Vacation planning 2020". How do you provide this suggestion above "Vacation planning 2019" in the spring of 2020, when "Vacation planning 2019" has been searched for 10,000 times and "Vacation planning 2020" has only been searched for 200 times?

Basically, you need to consider when these searches were performed, and value recency together with the number of searches. I don't have an exact formula to share, but as you can see in the high-level architecture, we divide the queries into "last year, last month, last week". Getting a good balance here will help boost recent queries that will be of interest to your users.
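One way to express "value recency together with the number of searches" is a weighted sum over those time buckets. The weights below are invented and would need tuning against your own query logs; the bucket counts simply echo the vacation-planning example above.

```python
def suggestion_score(count_last_week, count_last_month, count_last_year,
                     weights=(100.0, 10.0, 0.1)):
    """Recency-weighted popularity: recent searches count for much more."""
    w_week, w_month, w_year = weights
    return (w_week * count_last_week
            + w_month * count_last_month
            + w_year * count_last_year)

# Counts per (disjoint) time bucket, spring 2020:
print(suggestion_score(150, 50, 0))   # "Vacation planning 2020" -> 15500.0
print(suggestion_score(0, 5, 9995))   # "Vacation planning 2019" -> 1049.5
```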

Add Static lists

Sometimes you possess high-quality lists of words that you want to appear in the autocomplete suggestions without users first searching for them. Then you can populate the suggestions manually, once. You may have a list of all the conference room names in your building, or a list of subjects that content creators use to tag documents. Please go ahead and use lists like these in your autocomplete suggestions.

Highlight the right thing

When presenting search results on the results page, you want to highlight where the query matched the document (read about Results in the fourth part of this series). In the autocomplete suggestions, however, you want to do the opposite. In this state, users know what characters they just entered; they are looking for what you are suggesting – and that is what you highlight.


Highlighting what comes after, not what the user has already entered.
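In code, this is simply the inverse of result highlighting: style what follows the characters the user typed. A small sketch, with <b> as one possible styling choice:

```python
def format_suggestion(typed, suggestion):
    """Bold the part of the suggestion the user has NOT yet typed."""
    if suggestion.lower().startswith(typed.lower()):
        return suggestion[: len(typed)] + "<b>" + suggestion[len(typed):] + "</b>"
    return "<b>" + suggestion + "</b>"  # matched elsewhere: emphasise all of it

print(format_suggestion("sum", "summer vacation"))
# sum<b>mer vacation</b>
```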

Here we are, right at the end of autocomplete suggestions. Coming up in the next part, I will give you details about filters. Filters are surprisingly difficult to get right, but with some effort it's possible to make them shine. See you on the other side.

Further reading

13 Design Patterns for Autocomplete Suggestions

Get in touch

Contact Findwise

Contact Emil Mauritzson

Design Elements of Search – The Search Bar

Time for the first part in the series Design Elements of Search. How do you design a search solution so that it provides value to your organization? How do you make sure users enjoy, use and actually find what they expect? There are already so many great implementations of successful search applications, what can we learn from them? If these questions are in your domain, then you have reached the right place. Buckle up, you are in for a ride! Let’s dive into it right away by discussing the search bar.

A blog series – Six posts about Design Elements of Search


A word on Technology and Relevance – a disclaimer

Just as important as having a good user interface is having the right technology and the right relevance model set-up. I will not cover technology and relevance in this blog series. If you wish to read more, these topics are already well covered by Findwise: Improve search relevancy and Findwise.com/technology.


Designing the Search Bar

To set the scene and get cozy, here are some search bars.


A selection of search bars, for your pleasure.

Placing the search bar in the “right” place

Before discussing the individual graphical elements of the search bar, let's consider where a search bar can be placed. On the search page itself, it normally resides at the top of the page (think Google). However, consider the vast landscape of your digital workplace and you might understand where I am going. A search bar can be placed on your intranet, usually in the header. It can be placed in the taskbar of your workforce's computers. It can be placed in multiple other business applications in your control. From our perspective these are called entry points. It is well worth following up where your users come from. This is only one data point; you definitely want to follow up more usage statistics. You want to be data-informed. In our client projects we usually use Kibana for statistics, showing graphs in custom dashboards. Before redesigning something, we first analyze existing usage statistics, and then follow up with users to draw conclusions that will inform design decisions. I'll stop talking about usage statistics now – let's go ahead and break down the search bar.

Placeholder Text

A placeholder text invites users to the search bar. The placeholder text explains what your users can expect to find in this search solution. While respecting the tone of voice of your application, it doesn't hurt to be friendly and helpful here. Examples of good placeholder texts are: "What are you looking for today?", "How can we help?", "Find people, projects and more". H&M, the clothing store, has implemented a dynamic placeholder text that animates in a neat way.


Animated placeholder text that sparks interest in the different kind of things you can search for at IKEA.com

Google Photos switches it around and suggests what you can search for based on the metadata of your uploaded photos. Here are a few examples.


A variety of placeholder texts helping the user discover what can be searched for. The text is also personalized.

The placeholder text should be gray, so that the text is not mistaken to be actual data entered into the search bar. The placeholder text should immediately disappear when your user starts typing.

Contrast

Make sure the color of the search bar and the background color of the page provide enough contrast so that the search bar is clearly visible. It's also fine to have the same color if you provide a border around the search bar with enough contrast. Here are a few good examples, and some bad ones.

High Contrast


Clearly enough contrast on Bing.com


Easy to find the search bar on Dustin

Low Contrast

Google actually has low contrast on the border surrounding the search bar. The search bar also has the same color as the page. Normally this is something to avoid. There are few items on the page, and users expect to search at Google.com, so they get away with low contrast, I guess. Still, Bing is better in this regard.


Too little contrast on Google.


Where is the search bar? Look hard.

If you are unsure, check whether your current colors provide enough contrast using an online Contrast Checker. Chances are your contrasts are too low and need improvement.

The Search Button

This is the button that performs the search. Many people use the Enter key on their keyboard instead of clicking this button. However, you still want to keep the search button for clarity and ease of use. Generally, all icons should have labels; the search button is one of the few icons for which it's safe to skip the label. I can argue that the search icon is generally recognized, especially in the context of search. On the other hand, if you have the room, why not use a label? I mean, it cannot be clearer than this:


Clearly labeled buttons, easy to comprehend.

Clear the search bar easily with an “X”

As frequently implemented in mobile applications, you should provide an easy way of clearing the text field in your desktop application. This is accomplished with an "X" icon. As discussed above, not many icons are recognized by the majority of users, so it is common practice to provide labels for icons. For the "X" icon in this specific context, it is also fine to skip the label.


Make the text easy to remove.

Number of Results

After the query has been executed and results are showing, it is helpful to communicate how many results were returned. This provides value in itself, and in combination with filters it is even more powerful. Telling users how many results were returned helps them understand how your search application is working, especially in combination with applied filters. Skip ahead to Filters and read all about it. Avoid sounding like a robot – don't say "Showing 10 of 28482 results on Pages 1-2849". Plainly say "Showing 123 results" or "123 results found".


Make your search solution friendly and approachable, not robotic and stiff.

Did you mean

Use the power of search technologies and query analysis to give your users the option to adjust the initial query for the better. Sometimes you will suggest a correctly spelled query when your user misspelled, or you can suggest alternative phrases or other related terms.


The search solution can help you spell words correctly.

Here we are, right at the end of the first part. I hope it was compelling – there is more where this came from, so keep on reading. To sum up this first part: when designing the search bar, just the obvious things need to be right. In the second part, you'll get to know something called autocomplete suggestions. This feature helps your users formulate better queries, and that really is a good start.

Further reading

How to design: accessible search bars

Design a Perfect Search Box

Get in touch

Contact Findwise

Contact Emil Mauritzson

Making your data F.A.I.R and smart

This is the second post in a new series by Fredric Landqvist & Peter Voisey, explaining how your organisation could best shape its data landscape for the future.

How to create a smart data framework for your organisation

In our last post, we presented the benefits of F.A.I.R. data, how to make data smarter for search engines and the potential of an Information Commons. In this post, we're giving you the pragmatic steps to make your data FAIR by creating and applying your own smart data framework. Your data-sharing dream, internally and externally, is possible.

A smart data framework, using FAIR data principles, encompasses the tooling, models and standards that govern datasets and the different context-specific information systems (registers, catalogues). The data is then ingested and processed (enriched/refined) into smart data, datasets and data catalogues. It can then be used and reused by different applications and e-services via open APIs. In this ecosystem, all actors and information behaviours (personas) interplay: provision agents, owners, builders, enrichers, end-user searchers and referrers.

The workings of a smart data framework

A smart data & metadata catalogue   

A smart data & metadata catalogue (illustrated below) provides an organisational capability that aligns data management with the FAIR data principles. View it not so much as one system to rule them all, but rather as an ecosystem that is smart and sustainable. In order to simplify your complex and heterogeneous information environment, this set-up can be instantiated as one overarching mechanism. Although we are describing a data and metadata catalogue here, the exact same framework and set-up would of course also apply to your organisation's content, making it smarter and more findable (i.e. it gets the sustainable stamp).

The necessary services and components of a smart data catalogue

The above picture illustrates the services and components that, together, build smart data and metadata catalogue capabilities. We now describe each one of them for you:

Processing (Ingestion & Enrichment) for great Findability & Interoperability

  • (A) Ingest, harvest and operate. Here you connect the heterogeneous data sources for ingestion.

The configured input mechanisms describe each of the data sources, with their data, datasets and metadata ready for your catalogue search. Hopefully, at the dataset upload stage, you have provided a good system/form that now provides your search engine with great metadata (we recommend the open data catalogue standard DCAT-AP). The upload concept is interchangeable with machine-to-machine harvesting mechanisms (as with open data), traditional data integration, or manual provision by human upload effort. (D) Enterprise Metadata Repository: this is the persistent storage of data in the data catalogue, index and graph. All things get a persistent ID (see how to design persistent URIs) and rich metadata.
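As an illustration of what "a persistent ID and rich metadata" can look like in practice, here is a hedged sketch of a minimal DCAT-style dataset description built with the Python library rdflib; the URI, title and keywords are invented, and a real DCAT-AP record carries considerably more mandatory properties.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCAT, DCTERMS, RDF

EX = Namespace("https://data.example.org/id/dataset/")  # persistent URI base

g = Graph()
g.bind("dcat", DCAT)
g.bind("dct", DCTERMS)

dataset = EX["air-quality-2020"]
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.identifier, Literal("air-quality-2020")))
g.add((dataset, DCTERMS.title, Literal("Air quality measurements 2020", lang="en")))
g.add((dataset, DCTERMS.description,
       Literal("Hourly air quality measurements from municipal sensors.", lang="en")))
g.add((dataset, DCAT.keyword, Literal("air quality", lang="en")))
g.add((dataset, DCAT.keyword, Literal("environment", lang="en")))

print(g.serialize(format="turtle"))
```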

  • (B) Enrich, refine, analyze and curate. This is the AI part (NLP, Semantics, ML) that enriches the data and datasets, making them smarter. 

Concepts (read also: entities, terms, phrases, synonyms, acronyms etc.) from the data sources are found using named entity recognition (NER). By referring to a Knowledge Graph in the Enricher, the appropriate resources are annotated ("tagged") with the said concept. It does not end there, however: the concept also brings with it, from the Knowledge Graph, all of the known relationships it has with other concepts.

Essentially a Knowledge Graph is your encoded domain knowledge in a connected graph format. It is by reading these encoded relationships that the machine “understands” the meaning or aboutness of data.

This opens up a very nice Pandora’s box for your search (understanding query intent) and for your Graphical User Interface (GUI) as your data becomes smarter now through your ability to exploit the relationships and connections (semantics and context) between concepts.

You and AI can have a symbiotic relationship in the development of your Knowledge Graph. AI can suggest new concepts and relationships as new data is added. It is, however, you and your colleagues who determine which concepts and relationships belong in the Knowledge Graph – those that are important to your department or business. Remember that you can utilise more than one knowledge graph, or part of one, for a particular business need or data source. The Knowledge Graph is a flexible expression of your business/information models that gives structure to all your data and its access.

Extra optional step: if you can manage to index not only the dataset metadata but the datasets themselves, you can make your Pandora's box even nicer. Those cryptic/nonsensical field names that your traditional database experts love to create can also be incorporated and mapped (one time only!) into your Knowledge Graph, increasing the machine "understanding" of the data. There is then a better chance of the data asset being used more widely. 

The configuration of processing with your Knowledge Graph can take care of dataset versioning, lineage and add further specific classifications e.g. data sensitivity, user access and personal information.

Lastly on Processing: your cultural and system interoperability is immensely improved. We're not talking about everyone speaking the same language here, but rather everyone talking their own language (/culture) and still being able to find the same thing. In this, open and FAIR vocabularies further enrich the meaning of data, and your metadata is linked. System interoperability is partially achieved by exploiting the graph of connections that now "sits over" your various data sources.

Controlled Access (Accessible and Reusable)

  • (C) Access, search and visualize APIs. These tools control and influence the delivery, representation, exploration and consumption/use of datasets and data catalogues via a smarter search (made so by smarter data) and a more intuitive Graphical User interface (GUI).

This means your search can now “understand” user intent from just one or two keyword queries (through known relationship connections in the Knowledge Graph). 

Your search now also caters for your searchers who are searching in an unfamiliar subject area or are just having a query off day. Besides offering the standard results page, the GUI can also present related information (again due to the Knowledge Graph), past related user queries, information and question-answer (Q&A) type material. So: search, discovery, learning, serendipity.

Your GUI can also now become more intuitive, changing its information presentation and facets/filters automatically, depending on the query itself (more sustainable front-end coding). 

As an alternative to complex scenario coding, you can also create rules (set in your Knowledge Graph) that control what data users can access (when, how and where) based on their profile, their role, their location, the time and the device they are using. The same Knowledge Graph can help push and recommend data to certain users proactively. Accessibility will be possible by using standard communication protocols, open access (when possible), authentication where necessary, and always with metadata at hand.

Reusable: your new smart data framework can help increase the time your Data Managers (/Scientists, Analysts) spend using data (and not trying to find it – the 80/20 data science dilemma). It can also help reduce the risk to your AI projects (with their 50% failure rate) by helping searchers find the right data, with its meaning and context, more easily. Reuse will also be possible thanks to the design: multiple metadata attributes, usage licences and provenance, in line with community standards.

Users and information behaviour (personas)

Users and personas
User groups and services

From experience we have defined the following broad conceptual user-groups:

  • Data Managers, a.k.a. Data Op's or Data Scientists
    Data Managers are e.g. knowledge engineers, taxonomists and analysts.
  • Data Stewards
    Data Stewards are responsible for Data Governance, such as data lineage.
  • Business Professionals/Business end-users
    Business users may have a diverse background – hence "Business end-users".
  • Actor Systems are the different information systems, applications and services that integrate information via the rich open APIs of the Smart Data Catalogue

The outlined collaborative actors (E-H user groups) and their interplay as information behaviour (personas) with the data (repository) and services (components), together, build the foundation for a more FAIR data management within your organisation, providing for you at the same time, the option to contribute to an even broader shared open FAIR information commons.

  • (E) Data Op's workplace and dashboard is a combination of tools supporting the Data Op's data management processes, covering the information behaviours: data provision agent, enricher and developer.
  • (F) Data Governance workplace comprises the tools that support the Data Stewards' collaborative data governance work with Data Managers, covering the information behaviour: data owner.
  • (G) Access, search and visualize APIs is the user experience to explore, find and interact with the catalogue and data, covering the information behaviours: searcher and referrer.
  • (H) API is the set of open APIs that support access to catalogue data for consuming information systems, covering the information behaviour: referrer (a.k.a. data exchange).

Potential tooling for this smart data framework:

We hope you enjoyed this post and understand the potential benefits such a smart data framework incorporating FAIR data principles can have on your data catalogue, or for that matter, your organisational content or even your data swamps.


In the next post, Toward data-centric solutions with Knowledge Graphs, we talk about Knowledge Graphs (KG) and its non-proprietary RDF semantic web tech, how you can create your KG(s) and the benefits they can bring to your future data landscape.

Fredric Landqvist & Peter Voisey

Well-known findability challenges in the AI-hype

Organisations are facing new types of information challenges in the AI-hype. At least the use cases, the data and the technology are different. The recommended approach and the currently experienced findability challenges, however, remain the same.

Findability is getting worse as the data landscape is changing

As clearly shown in the results of the 2019 Search & Findability Survey, finding relevant information is still a major challenge for most organisations. In the internal context, as many as 55% find it difficult or very difficult to find information, which brings us back to the same levels that we saw in the survey results from 2012 and 2013.

Given the main obstacles that the respondents experience to improve search and findability, this is not very surprising:

  • Lack of resources/staff
  • Lack of ownership/mandate
  • Poor information quality

One reason behind the poor information quality might be the decreasing focus and efforts spent on traditional information management activities such as content life cycle, controlled vocabularies and metadata standards, as illustrated in the diagrams below*. In 2015-16 we saw an increase in these activities, which made perfect sense since "lack of tags" or "inconsistent tagging" were considered the largest obstacles for findability in 2013-2015. Unfortunately, the lack of attention to these areas doesn't seem to indicate that the information quality has improved – rather the opposite.

(*percent working with the noted areas)

A likely reason behind the experienced obstacles and the lack of resources to improve search and findability is a shift of focus in data and metadata management efforts following the rapid restructuring of the data landscape. In the era of digital transformation, attention is rather on the challenge to identify, collect and store the massive amounts of data that is being generated from all sorts of systems and sensors, both within and outside the enterprise. As a result, it is no longer only unstructured information and documents that are hard to find but all sorts of data that are being aggregated in data lakes and similar data storage solutions.

Does this mean that search and findability of unstructured information is no longer relevant? No, but in combination with finding individual documents, the target groups in focus (typically Data Scientists) have an interest in finding relevant and related data(sets) from various parts of the organisation in order to perform their analysis.

Digital (or data-driven) transformation is often focused on utilising data in combination with new technology to reach level 3 and 4 in the below “pyramid of data-driven transformation” (from In search for insight):

This is also illustrated by the technology trends that we can see in the survey results, presented in the article "What are organisations planning to focus on to improve Search and Findability?". Two of the most prominent emerging technologies are Natural Language Processing (NLP) and Machine Learning, which are both key components in what is often labelled "AI". Using AI to drive transformation has become the ultimate goal for many organisations.

However, as the pyramid clearly shows, to realise digital transformation, automation and AI, you must start by sorting out the mess. If not, the mess will grow by the minute, quickly turning the data lake into a swamp. One of the biggest challenges for organisations in realising digital transformation initiatives still lies in how to access and use the right data.  

New data and use cases – same approach and challenges

The survey results indicate that, irrespective of what type of data you want to make useful, you need to take a holistic approach to succeed. In other words, if you want to get past the POC phase and achieve true digital transformation you must consider all perspectives:

  • Business – Identify the business challenge and form a common vision of the solution
  • User – Get to know your users and what it takes to form a successful solution
  • Information – Identify relevant data and make it meaningful and F.A.I.R.*
  • Technology – Evaluate and select the technology that is best fit for purpose
  • Organisation – Establish roles and responsibilities to manage and improve the solution over time

You might recognise the five findability dimensions that were originally introduced back in 2010 and that are more relevant than ever in the new data landscape. The survey results and the experienced obstacles indicate that the main challenges will remain, and even increase, within the dimensions of information and organisation.

Also, it is important to remember that to create value from information it is not always necessary to aim for the top of the pyramid. In many cases it is enough to extract knowledge and thereby provide better insights and decision support by aggregating relevant data from different sources. Given that the data quality is good enough that is.

*The strategy for sustainable data management implies leaning upon the FAIR Data Principles:

  1. Make data Findable, through persistent IDs, rich metadata and indexes, and by combining ID + index.
  2. Make data Accessible, through standard communication protocols, open and free protocols, authentication mechanisms where necessary, and by always keeping metadata available.
  3. Make data Interoperable, through the use of vocabularies, terminologies and glossaries, using open vocabularies/models and linking the metadata.
  4. Finally, make data Reusable, by using multiple metadata attributes, setting constraints based on licenses, and expressing provenance to build trusted, quality datasets leaning upon community standards.

Author: Mattias Ellison, Findability Business Consultant