Update on The Enterprise Search and Findability Survey

A quick update on the status of the Enterprise Search survey.

We now have well over a hundred respondents. The more respondents we get, the better the data will be, so please help spread the word. We’d love to have several hundred more. The survey will now be open until the end of April.

But most important of all, if you haven’t already, have a cup of coffee and fill in the survey.

A Few Results from the Survey about Enterprise Search

More than 60% say that the amount of searchable content in their organisations today is less or far less than needed. And 85% say that in three years’ time the amount of searchable content in their organisation will increase or increase significantly.

75% say that it is critical to find the right information to support their organisation’s business goals and success. What is interesting to note is that over 70% of the respondents say that users don’t know where to find the right information or what to look for – and about 50% of the respondents say that it is not possible to search more than one source of information with a single search query.

In this context it is interesting that the primary goals for using search in organisations (where the answer was imperative or significant) are to:

  • Improve re-use of information and/or knowledge – 59%
  • Accelerate brokering of people and/or expertise – 55%
  • Increase collaboration – 60%
  • Raise awareness of “What We Know” – 57%
  • Eliminate siloed repositories – 59%

In many organisations search is owned either by IT (60%) or Communications (27%), has no specified budget (38%) and has less than one dedicated person working with it (48%). More than 50% have a search strategy in place or are planning to put one in place in 2012/13.

I think these numbers are interesting, but they definitely need to be segmented and analyzed further. That will of course be done in the report, which is due to be ready in June.

Google Search Appliance (GSA) 6.12 released

Google has released yet another version of the Google Search Appliance (GSA). It is good to see that Google stays active when it comes to improving its enterprise search product! Below is a list of the new features:

Dynamic navigation for secure search

The facet feature, introduced in 6.8, is still being improved. When filters are created, it is now possible to take into account that they should only include secure documents that the user is authorized to see.

Nested metadata queries

In previous Search Appliance releases there were restrictions on nesting meta tags in search queries. In this release many of those restrictions have been lifted.

LDAP authentication with Universal Login

You can configure a Universal Login credential group for LDAP authentication.

Index removal and backoff intervals

When the Search Appliance encounters a temporary error while trying to fetch a document during a crawl, it retains the document in the crawl queue and index. It schedules a series of retries after certain time intervals, known as “backoff” intervals, before removing the URL from the index.

An example of when this is useful is the processing pipeline that we have implemented for the GSA. The GSA uses an external component to index the content; if that component goes down, the GSA will receive a “404 – page does not exist” response when trying to crawl, and this may cause mass removal from the index. With this functionality turned on, that can be avoided.
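To illustrate the general idea (this is just a conceptual sketch of backoff retries, not how the GSA is implemented internally; the function names and intervals are made up), a crawler-style retry loop could look something like this:

```python
import time

# Conceptual sketch (not GSA internals): retry a fetch with growing
# "backoff" intervals before giving up and removing the URL from the index.
BACKOFF_INTERVALS = [60, 300, 1800, 7200]  # seconds; hypothetical values

def fetch_with_backoff(url, fetch, remove_from_index):
    """Try to fetch a document; retry on temporary errors and remove the URL
    only after all backoff intervals have been exhausted."""
    for interval in BACKOFF_INTERVALS:
        try:
            return fetch(url)          # success: the document stays in the index
        except IOError:                # temporary error, e.g. source unavailable
            time.sleep(interval)       # wait out the backoff interval, then retry
    remove_from_index(url)             # persistent failure: drop the URL
    return None
```

The point is simply that a short outage of the source (or of an indexing component in front of it) no longer leads to immediate removal.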

Specify URLs to crawl immediately in feeds

Release 6.12 provides the ability to specify URLs to crawl immediately in a feed by using the crawl-immediately attribute. This is a nice feature for prioritising what needs to be indexed quickly.
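As a rough illustration (the appliance hostname, datasource name and document URL below are made up, and the exact feed schema and endpoint should be verified against the GSA Feeds Protocol documentation), pushing a feed that uses the attribute might look something like this:

```python
import requests  # assumes the 'requests' package is available

# Hypothetical appliance hostname; the feedergate endpoint and form field
# names follow the GSA Feeds Protocol documentation, but verify them
# against your own appliance and version.
GSA_FEED_URL = "http://gsa.example.com:19900/xmlfeed"

feed_xml = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">
<gsafeed>
  <header>
    <datasource>intranet_news</datasource>
    <feedtype>metadata-and-url</feedtype>
  </header>
  <group>
    <!-- crawl-immediately asks the GSA to fetch this URL right away -->
    <record url="http://intranet.example.com/news/urgent.html"
            mimetype="text/html"
            crawl-immediately="true"/>
  </group>
</gsafeed>"""

response = requests.post(
    GSA_FEED_URL,
    data={"datasource": "intranet_news", "feedtype": "metadata-and-url"},
    files={"data": ("feed.xml", feed_xml, "text/xml")},
)
print(response.status_code, response.text)
```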

X-robots-tag support

The Appliance now supports the ability to exclude non-HTML documents from the index by using the X-Robots-Tag HTTP header.
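For example (a minimal, hypothetical sketch; the file name and little server are just placeholders), a web application could mark a PDF as not to be indexed by sending the header alongside the document:

```python
# Minimal sketch: serve a PDF with an X-Robots-Tag header so that crawlers
# honouring the header (such as the GSA with this feature) skip the document.
from wsgiref.simple_server import make_server

def app(environ, start_response):
    headers = [
        ("Content-Type", "application/pdf"),
        ("X-Robots-Tag", "noindex"),  # exclude this non-HTML document from the index
    ]
    start_response("200 OK", headers)
    with open("internal-report.pdf", "rb") as f:  # hypothetical file
        return [f.read()]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()
```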

For more details, see the Google Search Appliance documentation page.

KMWorld 2010 Reflections: Search is a Journey Not a Destination

Two weeks ago, Ludvig Johansson, Christopher Wallström and I attended KMWorld’s quadruple conference in Washington, D.C. The conference consisted of four different conferences: KMWorld, Enterprise Search Summit, Taxonomy Bootcamp and SharePoint Symposium. I focused on the Enterprise Search Summit and SharePoint Symposium, while Christopher mainly covered Taxonomy Bootcamp as well as the Enterprise Search Summit. (Christopher will soon write a blog post about this as well.)

During the conferences there was some good-quality content, but most of it was old news, with speakers mainly focusing on their own products. This was disappointing, since I had hoped to see the newest and coolest solutions within my area. Speakers presented systems from their corporations where the newest and coolest functionality they described was shallow filters on a Google Search Appliance. From my perspective this is not new or cool; I would rather consider it standard functionality in today’s search solutions.

However, some sessions were really good. Daniel W. Rasmus talked about the Evolution of Search in quite a fun and thoughtful way. One thing he wanted to see in the near future was more personalization of search: search needs to know the user and adapt to him or her, not simply use a standardized algorithm. As Rasmus put it: “my search engine is not that into me”. This is, as I would put it, spot on how we see it at Findwise. Today’s customers want standard search with components that have existed for years now. It’s time for search to take the next step in its evolution and for us to start delivering Findability solutions adapted to your needs as an individual. In line with this, Rasmus ended with another good quote: “Don’t let your search vendors set your expectations too low”. I think this speaks for itself, more or less. If we want contextual search, then we should push the vendors out there to start delivering!

Another good session was delivered by Ellen Feaheny on how to utilize both old and new systems smarter. It was from this session that the title of this post originates: “It’s a journey not a destination”. I think this sums up what we feel every day in our projects. It’s common that customers want projects to have a clear start and end. However, with search and Findability we see it as a journey; I would even go as far as to say it’s a journey without an end. We have customers coming to us complaining about their search, saying “It doesn’t work anymore” or “The content is old”, to give two examples. The problem is that search is not a one-time problem that you solve and then never have to think about again. If you don’t work with your search solution and treat search as a journey – continually improving relevance and content, and investing time in search analytics – your solution will soon get dusty and stop delivering what your employees or customers want.

Search is a journey not a destination.

OmniFind Enterprise Edition 9.1 – New Capabilities Discussed Over Breakfast

During the last year a number of interesting things have happened to IBM’s search platform, and the new version, OmniFind 9.1, was released this summer. Apart from a large number of improvements in the interface, the decision to base the new solution on open source (Lucene) has proven to be an ingenious way around some of OmniFind’s previous shortcomings.

The licensing model is still quite complicated, something Stephen E Arnold highlighted earlier this year. Since a number of our customers have chosen to take a closer look at OmniFind as a search solution, we decided to host a breakfast seminar together with IBM last Thursday in order to discuss the new features and show how some of our customers are working with it.

Without a doubt, the most interesting part is always to discuss how the solution can be utilized for intranets, extranets, external sites and e-business purposes.

Apart from this, we also took a look at some of the new features:

  • Type ahead (query suggestion), based on either search statistics or indexed content
  • Faceted search, i.e. the ability to filter on dates, locations, format etc. as well as numeric and date ranges. The latter is of course widely used within e-business.
  • Thumbnail views of documents (yes, exactly what it sounds like: a thumbnail view of the first page of each document in the results page)

Search analytics in OmniFind 9.1 holds a number of interesting statistics capabilities. Some things worth mentioning are the number of queries, query popularity, number of users, average response time (ms) and worst response time (ms).

Saved searches (to be able to go back and see if new information has been added), search within result sets (to further narrow a given result set) and did-you-mean functionality (spell checking) are also included.

…and improvements on the administrator side, just to mention a few:

  • Ability to change the relevancy, i.e. to adjust and give certain types of information a higher ranking
  • Support for incremental indexing, i.e. to only re-index the information that is new or changed since the last time you made it searchable

To conclude: IBM has made a whole lot of improvements in the new version, which are worth taking a closer look at. During the spring we are running upgrade projects for some of our customers, and we will keep you up to date with the different application areas OmniFind Enterprise Edition 9.1 is being used for. Please let us know if you have any particular questions or areas that you are interested in.

Information Flow in VGR

Last week, Kristian Norling from VGR (Västra Götaland Regional Council) posted a really interesting and important blog post about information flow. For those of you who don’t know what VGR has been up to previously, here is a short background.

For a number of years VGR has been working to turn into reality a model for how information is created, managed, stored, distributed and – perhaps most importantly – integrated.


Why is Information Flow Important?

In order to give your users access to the right information it is essential to get control of the whole information flow i.e. from the time it is created until it reaches the end user. If we lack knowledge about this, it is almost impossible to ensure quality and accuracy.

The fact that we have control also gives us endless possibilities when it comes to distributing the right information at the right time (an old cliché that is finally becoming reality). To sum up: that is what search is all about!

When information is being created VGR uses a Metadata service which helps the editors to tag their content by giving keyword suggestions.

In reality this means that the information can be distributed in the way it is intended. News items are, for example, tagged with subject, target group and organisational info (apart from dates, author, expiry date etc., which are automated) – meaning that people belonging to specific groups, with certain roles, will get the news that is important to them.
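Just to make the idea tangible, a tagged news item might look roughly like the sketch below (the field names and values are invented for illustration and are not VGR’s actual schema):

```python
# Hypothetical example of a news item after the metadata service has helped
# the editor tag it; automated fields are marked in the comments.
news_item = {
    "title": "New routines for referrals",
    "author": "Jane Doe",                  # automated
    "published": "2010-11-15",             # automated
    "expires": "2011-05-15",               # automated
    "subject": ["referrals", "primary care"],
    "target_group": ["physicians", "nurses"],
    "organisation": "Regional Hospital X",
}

def is_relevant(item, user_groups, user_org):
    """A portal page could use tags like these to show only relevant news."""
    return user_org == item["organisation"] and any(
        group in item["target_group"] for group in user_groups
    )
```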

Once the information is tagged correctly and published, it is indexed by search. This is done in a number of different ways: by HTML crawling, through RSS, by feeding the search engine or through direct indexing.

After this, the information is available through search and ready to be distributed to the right target groups. Portlets are used to give single sign-on access to a number of information systems, and template pages in the WCM (Web Content Management system) use search alerts to present updated information.

Simply put: a search alert for, say, meeting minutes that contain your department’s name will give you an overview of all information that concerns this as it is published, regardless of which system it resides in.

Furthermore, the blog post describes VGR’s work with creating short and persistent URLs (through a URL service) and how to “monitor” and “listen to” the information flow (for real-time indexing and distribution) – areas where we all have things to learn. Over time Kristian will describe the different parts of the model in detail, so be sure to keep an eye on the blog.

What are your thoughts on how to get control of the information flow? Have you been developing similar solutions for part of this?

Search Driven Portals – Personalizing Search

To stay at the forefront of search technology, Findwise has a focus on research, both in the form of larger research projects and through different thesis projects. Mohammad Shadab and I have just finished our thesis work at Findwise, where we explored an idea for search user interfaces that we call search driven portals. User interfaces are usually designed based on analysis of a smaller audience, but the final interface is then put into production targeting a much wider range of users. The solution is in many cases static and cannot easily be changed or adapted. With search driven portals, which are portlet-based UIs, users or administrators can adapt the interface to fulfill the needs of different groups. Developers design and develop a number of searchlets (portlets powered by search technology), where every searchlet provides a specific piece of functionality such as faceted search, a result list, related information etc. Users can then choose to add the searchlets with the functionality that suits them to their page, in a preferred location. From an architectural perspective, searchlets are standalone components, independent of each other and easy to reuse.
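To make the concept a bit more concrete, here is a minimal sketch of the searchlet idea (the class names and behaviour are invented for illustration; the actual thesis implementation is portlet-based and considerably richer):

```python
# Conceptual sketch: each searchlet is a standalone component that renders one
# piece of search functionality, and a user's page is simply the set of
# searchlets that user has chosen to add.
from abc import ABC, abstractmethod

class Searchlet(ABC):
    @abstractmethod
    def render(self, query: str, results: list) -> str:
        """Return an HTML fragment for this piece of search functionality."""

class ResultListSearchlet(Searchlet):
    def render(self, query, results):
        items = "".join(f"<li>{hit['title']}</li>" for hit in results)
        return f"<ul>{items}</ul>"

class FacetSearchlet(Searchlet):
    def __init__(self, field):
        self.field = field  # e.g. "department" or "format"

    def render(self, query, results):
        values = sorted({hit.get(self.field, "unknown") for hit in results})
        return "<div>" + ", ".join(values) + "</div>"

class SearchPage:
    """A user's page: an ordered collection of the searchlets they have added."""
    def __init__(self, searchlets):
        self.searchlets = searchlets

    def render(self, query, results):
        return "\n".join(s.render(query, results) for s in self.searchlets)

# A user who only wants a result list and a department facet on their page:
page = SearchPage([ResultListSearchlet(), FacetSearchlet("department")])
print(page.render("budget", [{"title": "Budget 2011", "department": "Finance"}]))
```

Because each searchlet only depends on the query and the results, new ones can be added to the library and to individual pages without touching the others.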

Such functionality includes faceted search, which provides filters to narrow down a search. These facets might need to differ based on what role, department or background users have. Developers can create a set of facets and let the users choose the ones that satisfy their needs. Search driven portals are also a great tool for making sure that sites don’t get flooded with information as new functionality is developed. If a new need evolves, or if the provider comes up with new ideas, the functionality is put into new searchlets which are deployed into the searchlet library. The administrator can broadcast new functionality to users by putting new searchlets on the master page, which affects every user’s own site. However, the users can still adjust to the changes by removing the new functionality if they don’t need it.

Search driven portals open up new ways of working, from both a developer and a usage perspective. It is a step away from the one-size-fits-all concept that many sites are expected to fulfill. Providers such as Findwise can build a large component library which can be customized into packages for different customers. With the help of the searchlet library, web administrators can set up designs for different groups, project managers can set up a project-adjusted layout and employees can adjust their site to their own requirements. With search driven portals, a wider range of user needs can be covered more easily.

The Business Case for Enterprise Search

1. Achieve higher employee efficiency levels by providing company-wide, swift access to relevant information

Every business day, employees need to access information stored in various enterprise applications and databases. Enterprise Search addresses this need by providing your co-workers with swift access to relevant information and by consolidating, ranking and presenting it properly. The value proposition of enterprise search is thus to promote core business by enabling co-workers to work more efficiently, to avoid redoing work done elsewhere and to produce better quality as the information they need can be found through one single search solution.

2. Make more money by providing revenue-driving business processes with tailored means to access and act on information

The larger the corporation, the more diverse the information access needs. Besides providing large user groups with general access to corporate information, an Enterprise Search solution can be tailored to meet the specific needs of revenue-driving business processes such as solution sales, business intelligence, patent management and mergers and acquisitions. There might not be that many people working in these areas, but the outcome of their work can have a tremendous impact on the bottom line of your company.

3. Leverage the hidden value of existing IT investments

The return on investment of Enterprise Search is not only a matter of getting your money’s worth for the license and deployment costs of the Enterprise Search solution. As the solution makes all the information hidden in document repositories findable through one search solution, the Enterprise Search solution will in fact help you get a return on investment on content management investments already made.

4. Lower your IT costs by centralizing access to information

Reduce your license, maintenance and support costs by providing one centralized Enterprise Search platform to handle all information access requests. Most companies store information in various information systems such as intranets and web sites, collaboration portals, document management systems, CRM and ERP systems and many other enterprise applications and databases. A typical set-up is to have separate search tools for each of these systems. By using your Enterprise Search platform as a service, you can replace these siloed search functions with one centrally monitored platform that provides search to each of these applications. In this way, you can reduce the annual costs on licenses, maintenance and support for separate search applications.

High Expectations to Googlify the Company = Findability Problem?

It is not a coincidence that the verb “to google” has been added to several renowned dictionaries, such as those from Oxford and Merriam-Webster. Search has been the de facto gateway to the Web for some years now. But when employees turn to Google on the Web to find information about the company they work for, your alarm bells should be ringing. Do you have a Findability problem within the firewall?

The Google Effect on User Expectations

“Give us something like Google or better.”

“Compared to Google, our Intranet search is almost unusable.”

“Most of the time it is easier to find enterprise information by using Google.”

The quotes above come from a study Findwise conducted during 2008-2009 for a customer who was on the verge of taking the first steps towards a real Enterprise Search application. The old Intranet search tool had become obsolete, providing access to a limited set of information sources only and ranking outdated information over the relevant documents that were in fact available. In short, search was causing frustration, and lots of it.

However, the executives at this company were wise enough to act on the problem. The goal was set pretty high: Everybody should be able to find the corporate information they need faster and more accurately than before. To accomplish this, an extensive Enterprise Search project was launched.

This is where the contradiction comes into play. Today users are so accustomed to using search as the main gateway to the Web that the look and feel of Google is often seen as equal to the type of information access solution you need behind the firewall as well. The reasons are obvious: on the Web, Google is fast and relevant. But can you – and more importantly, should you – adopt a solution from the Web within the firewall without question?

Enterprise Search and Web Search are different

  1. Within the firewall, information is stored in various proprietary information systems, databases and applications, on various file shares, in a myriad of formats and with sophisticated security and version control issues to take into account. On the Web, what your web crawler can find is what it indexes.
  2. Within the firewall, you know every single logged in user, the main information access needs she has, the people she knows, the projects she is taking part in and the documents she has written. On the Web, you have less precise knowledge about the context the user is in.
  3. Within the firewall, you have fewer links and other clear inter-document dependencies that you can use for ranking search results. On the Web, everything is linked together, providing an excellent starting point for algorithms such as Google’s PageRank.

Clearly, the settings differ as do user needs. Therefore, the internal search application will be different from a search service on the web; at least if you want it to really work as intended.

Start by Setting up a Findability Strategy

When you know where you are and where you want to be in terms of Findability—i.e. when you have a Findability strategy—you can design and implement your search solution using the search platform that best fits the needs of your company. It might well be Google’s Search Appliance. Just do not forget, the GSA is a totally different beast compared to the Google your users are accustomed to on the Web!

References

http://en.wikipedia.org/wiki/Googling