Speaking about Search as a Service @ PROMISE Technology Transfer day, want to meet up?

Tomorrow morning I leave Gothenburg to attend the PROMISE Technology Transfer day @ CeBIT 2013 in Hanover, Germany.

The event is a workshop introducing its participants to methodologies for the systematic evaluation and monitoring of search engines, and for discussing future trends and requirements for the next generation of information access systems. In other words, it is right up our alley at Findwise.

As Director of Research at Findwise I will speak about Search as a Service. If you are at the event or just nearby, I would be happy to meet up and have a chat. I will be around from Tuesday, March 5 until Thursday, March 7. Feel free to email me at henrik.strindberg@findwise.com or give me a call at +46709443905.

Hope to see you there!

Better Search Engines and Information Practices in Digital Workplaces

During this year I have worked on a research project that aims to facilitate the development and implementation of an enterprise search engine. By understanding the use and value of information in digital workplaces, we hope to create even better preconditions for optimizing a search engine to the requirements of a specific organization.

We use a work-task based research approach where we study information practices – that is, the normalized ways in which we recognize information needs, look for information, and value and use it. By studying such practices in real-life work tasks, we can outline the role that a search engine plays in relation to other work tasks as well as to other ways of finding information. In short, being engaged in a creativity-oriented work task initiates different types of information practices compared to the practices we use in everyday, routine-based work tasks …

The creativity-oriented work tasks involve a dimension of innovation, and concepts such as learning and development are often used to describe these activities. Uncertainty is something that is associated with curiosity and may be seen as a driving force behind information seeking. Information that is rich in nuances and that offers different, even contradictory explanations or descriptions is usually appreciated, and the task outcome is only vaguely discerned at first. Routine-oriented tasks, on the other hand, are focused on increasing effectiveness and reducing uncertainty as quickly as possible in the task outcome, which itself may be sketched out relatively clearly from the beginning. Information seeking is often directed to readily available facts. All this means that a search engine must support a variety of information practices at any given workplace!

The “we” in this project is myself together with my Findwise colleague Henrik Strindberg. The project is financially supported by the Swedish Foundation for Strategic Research, and when I am not working on the present project I am employed by the University of Borås.

Just now I am finalizing a presentation of the project for the ICKM conference in Pittsburgh, PA, USA, next week. The presentation is entitled “Interrelated use and value of information sources”, and will be available through the conference proceedings in due time.

Very exciting … and while there I will also attend the board meetings of the ASIS&T’s Board of Directors as a newly appointed Director-at-Large. Very exciting, too!

The 73rd Annual Meeting of ASIS&T focuses on “Navigation Streams in an Information Ecosystem”.

Google Instant – Can a Search Engine Predict What We Want?

On September 8th Google released a new feature for their search engine: Google Instant.
If you haven’t seen it yet, there is an introduction on YouTube that is worth spending 1 minute and 41 seconds on.

Simply put, Google Instant is a new way of displaying results and helping users find information faster. As you type, results are presented in the background. In most cases it is enough to type two or three characters and the results you expect are already right in front of you.

The Swedish site Prisjakt has been using this for years, helping the users to get a better precision in their searches.

At Google you have previously been guided by “query suggestion”, i.e. you got suggestions based on what others have searched for before – a function also used by other search engines such as Bing (where it is called Type Ahead). Google Instant takes it one step further.
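
To make the idea of query suggestion a bit more concrete, here is a minimal sketch of prefix-based suggestions built from a log of earlier queries. It is only an illustration: the class name, the sample log and the ranking by frequency are my own assumptions, and this is certainly not how Google or Bing implement it.

```python
from bisect import bisect_left
from collections import Counter


class QuerySuggester:
    """Suggest past queries that start with what the user has typed so far."""

    def __init__(self, past_queries):
        # Count how often each normalized query was issued (hypothetical log data).
        self.counts = Counter(q.strip().lower() for q in past_queries)
        # A sorted list lets us find all queries sharing a prefix with binary search.
        self.sorted_queries = sorted(self.counts)

    def suggest(self, prefix, limit=3):
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        start = bisect_left(self.sorted_queries, prefix)
        matches = []
        for query in self.sorted_queries[start:]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        # Most frequently issued queries first; ties are broken alphabetically.
        matches.sort(key=lambda q: (-self.counts[q], q))
        return matches[:limit]


# Hypothetical query log.
log = ["weather gothenburg", "weather stockholm", "weather gothenburg",
       "web search", "wikipedia"]
suggester = QuerySuggester(log)
print(suggester.suggest("we"))
```

Typing just “we” already surfaces the most popular matching query first, which is the same effect the post describes: two or three characters are often enough.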

When looking at what the blog community has to say about the new feature, it seems to split users into two groups: you either hate it or love it.

So, what are the consequences? From an end-user perspective we will most likely stop typing if something interesting appears that draws our attention. The result?
The search results shown at the very top will generate more traffic, search will become more personalized over time, and we will most probably get better at phrasing our queries.

From an advertising perspective, this will most likely affect the way people work with search engine optimization. Some experts, like Steve Rubel, claim Google Instant will make SEO irrelevant, whereas others, like Matt Cutts, think it will change people's behavior in a positive way over time and explain why.

What Google is doing is something that they constantly do: change the way we consume information. So what is the next step?

CNN summarizes what Eric Schmidt, the CEO of Google, says:

“The next step of search is doing this automatically. When I walk down the street, I want my smartphone to be doing searches constantly: ‘Did you know … ?’ ‘Did you know … ?’ ‘Did you know … ?’ ‘Did you know … ?’ ”

Schmidt said at the IFA consumer electronics event in Berlin, Germany, this week.

“This notion of autonomous search — to tell me things I didn’t know but am probably interested in — is the next great stage, in my view, of search.”

Do you agree? Can we predict what the users want from search? Is this the sort of functionality that we want to use on the web and behind the firewall?

Real Time Search in the Enterprise

Real time search is a big buzz on the Internet. Major search engines like Google and Bing now provide users with real time search results from Facebook, Twitter, blogs and other social media sites. Real time search means that as soon as content is created or updated, it is immediately searchable. This might seem obvious and like a basic requirement, but if you work with search you know that this is not the case most of the time. Looking inside the firewall, in the enterprise, I dare say that real time search is far from common. Sometimes content does not change very frequently, so it is not necessary to make it instantly searchable. In many cases, though, it is the technical architecture that limits a real time search implementation.

The most common way of indexing content is by using a web crawler or a connector. Either way, you schedule them to go out and fetch new, updated and deleted content at specific intervals during the day. This is the basic architecture for search platforms these days. The advantage of this approach is that the content systems do not need to adapt to the search platform; they just deliver content through their ordinary APIs during indexing. The drawback is that new or updated content is not available until the next scheduled indexing. Depending on the system this might take several hours. For several reasons, mostly performance, you do not want to schedule connectors or web crawlers to fetch content too often. Instead, to provide real time search you have to do it the other way around: let the content system push content to the search platform.

Most systems have some sort of event system that triggers an event when content is created, updated or deleted. By listening for these events, the system can send the content to the search platform at the same time as it is stored in the content system. The search platform can immediately index the pushed content and make it searchable. This requires adapting the content system to the search platform. In this case, though, I think the advantages outweigh the disadvantages. Modern content systems provide (or should provide) a plug-in architecture, so you should fairly easily be able to plug in this kind of code. These plug-ins could also be provided by the search platform vendors, just as ordinary connectors are provided today.
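
The push approach can be sketched in a few lines. Everything here is a toy under stated assumptions: `ContentSystem`, `SearchIndex` and `make_index_plugin` are hypothetical names, the “search platform” is just an in-memory inverted index, and a real content system would expose its own event or plug-in API.

```python
from collections import defaultdict


class SearchIndex:
    """A toy in-memory inverted index standing in for a search platform."""

    def __init__(self):
        self.docs = {}                      # doc_id -> text
        self.postings = defaultdict(set)    # term -> set of doc_ids

    def index(self, doc_id, text):
        self.delete(doc_id)                 # replace any previous version
        self.docs[doc_id] = text
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def delete(self, doc_id):
        old = self.docs.pop(doc_id, None)
        if old is not None:
            for term in old.lower().split():
                self.postings[term].discard(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


class ContentSystem:
    """A toy content system with a plug-in style event mechanism."""

    def __init__(self):
        self.store = {}
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def save(self, doc_id, text):
        self.store[doc_id] = text
        self._emit("saved", doc_id, text)

    def remove(self, doc_id):
        self.store.pop(doc_id, None)
        self._emit("deleted", doc_id, None)

    def _emit(self, event, doc_id, text):
        # Notify every registered plug-in as part of the same operation.
        for listener in self.listeners:
            listener(event, doc_id, text)


def make_index_plugin(index):
    """Build the plug-in that pushes content events to the search index."""
    def on_event(event, doc_id, text):
        if event == "saved":
            index.index(doc_id, text)
        elif event == "deleted":
            index.delete(doc_id)
    return on_event


index = SearchIndex()
cms = ContentSystem()
cms.subscribe(make_index_plugin(index))

cms.save("doc-1", "Quarterly report draft")
print(index.search("report"))    # the document is searchable right after save
cms.save("doc-1", "Quarterly forecast draft")
print(index.search("report"))    # the old wording is gone after the update
cms.remove("doc-1")
print(index.search("forecast"))  # deleted content disappears immediately
```

Note that there is no schedule anywhere: the index changes in the same call that changes the content, which is the whole point of pushing instead of crawling.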

Do you agree, or have I been living in a cave for the past years? I’d love to hear your comments on this subject!

Relevance is Important – and Relevant

A couple of weeks ago I read an interesting blog post comparing the relevance of three different search engines. This made me start thinking about relevance and how it is sometimes overlooked when choosing or implementing a search engine in a findability solution. A big misconception is that if we just install a search engine we will get splendid search results out of the box. While it is true that the results will be better than those of an existing database-based search solution, the amount of configuration needed to get splendid results depends on how good the relevance is from the start. And as seen in the blog post, there can be quite a difference between search engines, and relevance is important.

So what is relevance and why does it differ between search engines? Computing relevance is the core of a search engine. Essentially the goal is to deliver the most relevant set of results with regard to your search query. When you submit your query, the search engine uses a number of algorithms to find, within all indexed content, the documents or pages that best correspond to the query. Each search engine uses its own set of algorithms, and that is why we get different results.
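
As an illustration of what “computing relevance” can mean, here is a minimal sketch of one classic textbook scheme, TF-IDF, where a document scores higher when it contains the query terms often and when those terms are rare in the collection. This is an assumption chosen for the example, not the algorithm of any particular search engine; real engines combine many more signals, and the smoothed `log(1 + N/df)` weighting is just one common variant.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def rank(query, documents):
    """Rank documents by a simple TF-IDF score for the query terms."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in tokenized.items():
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if tf[term]:
                idf = math.log(1 + n_docs / df[term])   # rarer terms weigh more
                score += (tf[term] / len(tokens)) * idf
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical mini collection.
docs = {
    "a": "enterprise search engine tuning guide",
    "b": "lunch menu for the week",
    "c": "search search relevance in enterprise search",
}
print(rank("search relevance", docs))
```

Change the weighting formula and the order of the results can change, which is exactly why two engines indexing the same content return different result lists.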

Since the relevance is based on the content, it will also differ from company to company. That’s why we can’t say that one search engine has better relevance than another. We can just say that it differs. To know which one performs best, you have to try it out on your own content. The best way to choose a search engine for your findability solution would thus be to compare a couple and see which yields the best results. After comparing the results, the next step would be to look at how easy it is to tune the relevance algorithms, to what extent tuning is possible and how much you need to tune. If the relevance is good from the start you might not need to do much relevance tuning, and thus you don’t need the “advanced relevance tuning functionality” that might cost extra money.
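
One hedged way to make such a comparison less subjective is to judge a handful of typical queries by hand and measure how many of each engine's top hits are relevant (precision at k). The queries, document ids and engine outputs below are invented for the example; in practice you would collect them from your own content and the engines you are evaluating.

```python
def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k results that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    return hits / k


def mean_precision(results_by_query, judgments, k=5):
    """Average precision@k over all judged queries for one engine."""
    values = [
        precision_at_k(results_by_query.get(query, []), relevant, k)
        for query, relevant in judgments.items()
    ]
    return sum(values) / len(values) if values else 0.0


# Hypothetical relevance judgments made on your own content.
judgments = {
    "travel policy": {"d1", "d4"},
    "expense report": {"d2"},
}
# Hypothetical top results returned by two engines under evaluation.
engine_a = {"travel policy": ["d1", "d4", "d9"], "expense report": ["d7", "d2", "d3"]}
engine_b = {"travel policy": ["d9", "d1", "d8"], "expense report": ["d3", "d7", "d5"]}

for name, results in [("A", engine_a), ("B", engine_b)]:
    print(name, round(mean_precision(results, judgments, k=2), 2))
```

With these made-up numbers engine A comes out ahead, but the only result that matters is the one you get on your own documents and queries.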

In the end, the best search engine is not the one with the most functionality. The best one is the one that gives you the most relevant results, and by choosing a search engine with good relevance for your content, some initial requirements might become obsolete, which will save you time and money.

The ROI of Enterprise Search—Where’s the Beef?

When faced with the large up-front investment of an Enterprise Search installation, executives are asking for proof that the investment will pay off. It is easy to quantify the value of search on an e-commerce site or as part of the company helpdesk (increased sales, shorter response times), but how do you go about verifying that your million-dollar Enterprise Search application has the desired effects on your revenue stream?

Search engines on the Web have changed the landscape of information access. Today, employees are asking for similar search capabilities within the firewall as they are used to having on the Web. Search has become the preferred way of finding information quickly and accurately.

Top executives at large corporations have heard the plea and nowadays see the benefits of efficient Findability. However, it costs to turn the company information overload from a storage problem of the IT department to a valuable asset and business enabler for everybody. So how do you prove the investment worthwhile?

The Effects of Enterprise Search

Before you can prove anything, you need to establish the effects you would like your Enterprise Search solution to have on your organization. Normally, you would want an Enterprise Search solution to:

  •  Enable people to work faster
  •  Enable people to produce better quality
  •  Provide the means for information reuse
  •  Inspire your employees to innovate and invent

These are all effects that a well-designed and maintained Enterprise Search application will help you address. However, the challenge when calculating the return on investment is that you are attempting to have an effect on workflows that are not clearly visible on your revenue stream. There is no easy way to interlink saved or earned dollars to employees being more innovative.

So how do you prove that you are not wasting money?

There are two straightforward ways to address the problem: studying how users really interact with the Enterprise Search application and asking them how they value it.

User Behavior through Search Logs

By extracting statistics from the logs of your Enterprise Search application, you can monitor how users interact with the tool. There are several statistical measures that can be interesting to look at in order to establish a positive influence on one or more of the targeted effects.

A key performance indicator for calculating if the Enterprise Search application enables people to work faster is to monitor the average ranking of a clicked hit in the result list. If people tend to scroll down the result set before clicking a hit and opening up a document, this implies the application does not provide proper ranking of the results. In other words, users are forced to review the result set, which obviously slows them down.
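
As a rough sketch of this indicator, the snippet below computes the average rank of clicked hits, plus the share of clicks landing in the top three, from a search log. The log format (query, rank of the clicked hit) and the sample records are assumptions; your search platform will have its own log layout.

```python
from statistics import mean

# Hypothetical log records: (query, rank_of_clicked_hit); None means no click.
click_log = [
    ("travel policy", 1),
    ("travel policy", 3),
    ("expense report", 7),
    ("holiday calendar", None),
    ("expense report", 1),
]


def average_click_rank(log):
    """Mean 1-based position of clicked hits, ignoring searches without a click."""
    ranks = [rank for _, rank in log if rank is not None]
    return mean(ranks) if ranks else None


def click_share_in_top(log, n=3):
    """Share of clicks that landed within the top n results."""
    ranks = [rank for _, rank in log if rank is not None]
    if not ranks:
        return None
    return sum(1 for rank in ranks if rank <= n) / len(ranks)


print(average_click_rank(click_log))
print(click_share_in_top(click_log, n=3))
```

A falling average rank over time suggests that ranking is improving and users need to review less of the result list before they find what they need.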

By monitoring the number of users that are using the system, by following the number of different documents they open up through search and by observing the complexity of the queries they perform, you can estimate the level of information your users are expecting to find through searching.

If the application is trusted to render relevant, up-to-date results, more users will use it, they will carry out more complex queries and they will open up a wider range of different documents. If your users do not trust the system, however, they will not use it or they will only search for a limited set of simple things such as “news”, “today’s menu” or “accounting office”. If this is the case, you can hardly say your Enterprise Search application has met the requirements posed on it.

Conversely, if the users access a wide set of documents through search and you have a large number of unique users and queries, then this implies your Enterprise Search application is a valued information access tool that promotes information reuse and innovation based on existing corporate knowledge.
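
These usage measures can be pulled out of a log in much the same way. The sketch below assumes a hypothetical log of (user, query, opened document) records and uses the number of words per query as a crude stand-in for query complexity; both are assumptions made for the example.

```python
# Hypothetical log records: (user_id, query, opened_document_id or None).
search_log = [
    ("u1", "news", None),
    ("u1", "project plan alpha 2013", "doc-17"),
    ("u2", "today's menu", "doc-3"),
    ("u3", "supplier contract renewal terms", "doc-42"),
    ("u3", "news", "doc-1"),
]


def usage_summary(log):
    """Summarize how widely and how deeply the search application is used."""
    users = {user for user, _, _ in log}
    queries = {query.lower() for _, query, _ in log}
    opened = {doc for _, _, doc in log if doc is not None}
    # Number of words per query as a crude proxy for query complexity.
    lengths = [len(query.split()) for _, query, _ in log]
    return {
        "unique_users": len(users),
        "unique_queries": len(queries),
        "distinct_documents_opened": len(opened),
        "avg_terms_per_query": sum(lengths) / len(lengths) if lengths else 0.0,
    }


summary = usage_summary(search_log)
for name, value in summary.items():
    print(name, value)
```

Tracked month by month, these numbers show whether usage is broadening (more users, more distinct documents, longer queries) or stuck at “news” and “today’s menu”.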

User Expectations through Surveys

Another way to collect information for assessing the return on investment of your Enterprise Search initiative is to ask the users what they think. If you ask a representative subset of your intended users how well the Enterprise Search application fits their specific purposes, you will have an estimate of the quality of the application.

There are a lot of other questions you can ask: Does the application help the user to find relevant corporate information? Are the results ranked properly? Does the application help the user to get an overall picture of a topic? Does it enable the user to get new ideas or find new opportunities? Does it help the user avoid duplicating work already done elsewhere within the organization?

A Combination of Increased Usage and Perceived Value

As we have seen, the return on investment of an Enterprise Search initiative is often hard to quantify, but the impact such an application has on a set of targeted effects can be measured using search logs and user surveys. The data collected this way provides an estimate of the value of Findability within the firewall of an organization.

Nowadays, hardly anybody questions the marketing value of a good corporate web site or the impact email has on the way we communicate. Such channels and services are self-evident business enablers today. In this respect, the benefits of precise and quick information access within the corporation should be self-evident. The trick is to get the tool just right.

Information Discovery: Search-in-page

Sometimes users know exactly what they are looking for; sometimes they are just looking to discover new areas. When it comes to information discovery, a plain, one-dimensional result list is not the most suitable tool.

Worldwide you’ll find quite a few innovative solutions, some of them mentioned earlier in Findwise’s blog: Quintura and KartOO are two search engines that visualize the clusters of results and the relationships between them, as does Clusty, which lets you discover related topics. Other examples are projects like Zuula and Dogpile, which aggregate results and let you know what you can find in Google, Yahoo, Live, Exalead etc. from one single search box – hopefully helping you find new perspectives.
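
Aggregating results from several engines raises the question of how to merge the lists. A simple option, shown here purely as an illustration and not as a description of how Zuula or Dogpile actually work, is to interleave the ranked lists round-robin and drop duplicates. The engine names and URLs are made up.

```python
from itertools import zip_longest


def merge_results(result_lists):
    """Interleave ranked lists round-robin, keeping the first occurrence of each URL."""
    merged, seen = [], set()
    # zip_longest pads the shorter lists with None, which we skip below.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical ranked results from two different engines for the same query.
first_engine = ["a.example", "b.example", "c.example"]
second_engine = ["b.example", "d.example"]
print(merge_results([first_engine, second_engine]))
```

Round-robin treats every engine as equally trustworthy; a real metasearch service would more likely weight the sources or re-rank the merged list.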

In a few days' time Searchinpage, created by entrepreneurs in Sweden, will be available.
Searchinpage lets you mark any word in the result and use it as input for a new query. By enabling users to search instantly, this will hopefully create other ways to explore and discover areas related to your initial query. Searchinpage will be available in a public version and as a special solution for enterprises and organizations with specific needs. The new player seems to have a lot of cards up its sleeve (including linguistic functionality and ideas similar to Zuula and Dogpile) – worth keeping an eye on.

Find People with Spock

Today, Google is the main source for finding information on the web, regardless of the kind of information you’re looking for. Be it company information, diseases, or people – Google is used for finding everything. While Google is doing a great job of finding relevant information, it can be good to explore alternatives that concentrate on a more specific target.

In the previous post, Karl blogged about alternatives to Google that provide a different user interface. Earlier, Caroline enlightened us about search engines that lead to new ways of using search. Today I am going to continue on these tracks and tell you a bit about a new challenger, Spock, and my first impressions of using it.

Spock, released last week in a beta version, is a search engine for finding people. Interest in finding people, both celebrities and ordinary people, has risen in the past years; just look at the popularity of social networking sites such as LinkedIn and Facebook. By using a search engine dedicated to finding people, you get more relevant hits and more information in each hit. Spock crawls the above-mentioned sites, as well as a bunch of others, to gather information about the people you want to find.

When you begin to use Spock, you instantly see the difference in search results compared to Google. Searching for “Java developer Seattle” in Spock returns a huge list of Java developers based in Seattle. With Google, you get a bunch of job ads. Searching for a famous person like Steve Jobs with Google, you find yourself with thousands of pages about the CEO of Apple. Using Spock, you will learn that there are a lot of other people around the world also named Steve Jobs. With each hit, you find more information such as pictures, related people, links to pages that the person is mentioned on, etc.

In true Web 2.0 fashion, Spock uses tags to place people into categories. By exploring these tags, you will find even more people that might be of interest. Users can even register on Spock to add and edit tags and information about people.

Overall, Spock seems like a great search engine to me. Since users can contribute to the content, which is what has made Wikipedia what it is today, and since it combines this with good relevancy and a clean interface, it has a promising future. It also shows how it is possible to compete with Google and the other giants in the search market by focusing on a specific target and delivering an excellent search experience in that particular area.

Interesting New Search Features

Out on the web there are a large number of small search engines that try to stand out and maybe take some of the market shares from Google. Many of them have interesting search features.

I would like to introduce some of them in order to help others realize that search can (and should) be a bit more than a search bar and a list of hits. A number of these alternative search engines have focused on presenting the search result visually in interesting ways. For example, the search engine quintura uses tag clouds of terms and concepts related to the original query.

A slightly different approach has been taken by mnemomap and webbrain, which present related concepts in a graph instead. Another approach is to visually divide the search results into categories, so that they can easily be navigated and also give a quick overview of the subject; examples of that can be seen at mooter and kooltorch. Finally, I would also like to mention kartOO, which has, in my opinion, gone one step further and even presents the links to the search results with images and icons.

In conclusion one can say that the ability to graphically visualize the search result so that it is possible to get a quick overview of a particular subject can prove to be a very important feature in future search solutions. It would not only help users find what they want to know, but also help them get a better and wider understanding of a particular subject, without forcing them to read through a large chunk of (hopefully) relevant text.

The search result and related concepts can be presented graphically instead. That will also take advantage of the fact that people can take in a lot more information through an image than by reading text. Further, it can help the user to easily see if he or she is on the right track and make possible refinements to the query even before any returned document has been read through, thus saving valuable time, which today is more important than ever.