Graph Search from Down Under

We’ve already written about Graph Search, the new concept being popularized by Facebook. Wouldn’t it be cool to apply it to the enterprise as well, as I wrote in an earlier blog post on Enterprise Graph Search? That is what the Australian startup Lumanetix thought when it created SPAR-K, a graph search engine for the enterprise.

Applied graph search

As the screenshots of the product show, it runs queries against relational databases with linked data objects, such as Movies linked to People in Casts, or Managers of Departments in an organization. One difference from Facebook Graph Search is the more Google-like, keyword-based query syntax, whereas Facebook uses natural language processing to interpret specific queries.

Graph search applied to the enterprise

It’s exciting to see the market picking up speed with new innovations in the enterprise search field, of which Lumanetix SPAR-K is one example.

 

/Christian Ubbesen

Enterprise Graph Search

Facebook will soon launch their new Graph Search to the general public, and it has received a lot of interest lately.

With graph search, users will be able to query the social graph that millions of people have constructed over the years by friending each other and entering more and more personal information about themselves and their friends into the vast Facebook database. It will be possible to query for friends of friends who have similar interests to yours and invite them to a party, or to query for companies where people with beliefs similar to yours work, and so on. The information that is already available will all of a sudden become much more accessible through the power of graph search.

How can we bring this to an enterprise search environment? Well, there are lots of graphs to query in the enterprise as well, both social and other types. For example, how about being able to query for people who have been members of a project in the last three years that successfully brought a new product to market? This would be an interesting list of people to know about if you’re a marketing director who wants to assemble a team in the company to create a new product and make sure it succeeds in the market.

If we dissect graph search, we will find three important concepts:

  1. The information we want to query against needs not only to be indexed into one central search engine; the relations and attributes of all information objects also need to be normalized to create the relational graph and to have standard attributes to query against. We could use the Open Graph Protocol as the foundation.
  2. We need a parser that takes human language and converts it to a formal query language that a search engine understands. We might want to query in different human languages as well.
  3. The presentation of results should be adapted to the kind of information sought for. In Facebook’s example, if you query for people you will get a list of people with their pictures and some relevant personal information in the result list, and if you query for pictures you will get a collage of pictures (similar to the Google image search).

So the recipe for success is to give the information management part of the project a big focus, making sure to create a unified information model of the content to be indexed. Then create a query parser for natural language based on actual user behavior; the same user studies would also tell us how to visualize the different result set types.
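To make the project example above a bit more concrete, here is a minimal sketch of what such a query could look like once the relations and attributes have been normalized. Everything in it is an assumption made for illustration: the edge list, the attribute names (`type`, `successful`, `ended`) and the function name are hypothetical, and a real solution would run the query in the search engine or a graph store rather than over an in-memory list.

```python
from datetime import date

# Hypothetical, hand-built enterprise graph: each edge is (subject, relation, object).
EDGES = [
    ("anna", "member_of", "project_apollo"),
    ("ben", "member_of", "project_apollo"),
    ("carl", "member_of", "project_borealis"),
    ("anna", "member_of", "project_cirrus"),
]

# Normalized attributes for each project node (attribute names are illustrative only).
PROJECTS = {
    "project_apollo": {"type": "product_launch", "successful": True, "ended": date(2012, 6, 1)},
    "project_borealis": {"type": "product_launch", "successful": False, "ended": date(2012, 9, 1)},
    "project_cirrus": {"type": "product_launch", "successful": True, "ended": date(2008, 3, 1)},
}


def members_of_successful_launches(edges, projects, today, years=3):
    """Return people who were members of a successful product launch that ended within `years`."""
    # Simplification: this cutoff does not handle a `today` that falls on 29 February.
    cutoff = today.replace(year=today.year - years)
    people = set()
    for person, relation, project in edges:
        attrs = projects.get(project)
        if relation != "member_of" or attrs is None:
            continue
        if attrs["type"] == "product_launch" and attrs["successful"] and attrs["ended"] >= cutoff:
            people.add(person)
    return sorted(people)


result = members_of_successful_launches(EDGES, PROJECTS, today=date(2013, 1, 15))
print(result)
```

The natural language parser from point 2 would sit in front of a function like this, turning a phrase such as "people in successful product launches in the last three years" into the relation, attribute filters and time window used here.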

I believe we will see more of this kind of solution in the enterprise search market in the coming years, and I look forward to exploring the possibilities together with our clients.

Impressions of GSA 7.0

Google released Google Search Appliance (GSA) 7.0 in early October. Magnus Ebbesson and I attended the Google-hosted pre-sales conference in Zürich, where some of the new functionality was presented along with what the future will bring to the platform. Google is really putting effort into the platform, and it gets stronger with each release. Personally I tend to like hardware and security updates the most, but I have to say that some of the new features are impressive and have great potential. I have had the opportunity to try them out for a while now.

In late November we held a breakfast seminar at the office in Gothenburg where we talked about GSA in general, with a focus on GSA 7.0 and the new features. My impression is that the translate functionality is very attractive for larger enterprises, while the previews bring a big wow factor in general. The possibility of configuring ACLs for several domains is great too, since many larger enterprises tend to have several domains. The entity extraction is of course interesting and can be very useful; a processing framework would enhance it even further, however.

It is also nice to see that Google is improving the hardware. The robustness is a really strong argument for selecting GSA.

It’s impressive to see how many languages the GSA can handle and how quickly it performs the translation. The user still needs basic knowledge of the foreign language, since the query is not translated. However, it is reasonably common to have a corporate language which most of the employees can handle.

The preview functionality is a very welcome feature. The fact that it can highlight pages within a document is really nice. I have experimented with using it through our Jellyfish API, with some success. Below are two examples of the preview functionality in use.

GSA 7.0 Preview

GSA 7 Preview - Details

A few thoughts

At the conference we attended in Zürich, Google mentioned that they are aiming to improve the built-in template in the GSA. The standard template is nice, and makes it possible to set up a decent graphical interface at almost no cost.

My experience, however, is that companies want the frontend integrated with their own systems. Also, we tend to use search for more purposes than the standard usage. Search-driven intranets, where you build intranet sites based on search results, are an example of search being used in a different manner.

A concept that we have introduced at Findwise is search as a service. It means that the search engine is a stand-alone product with APIs that make it easy to send data to it and extract data from it. We have created our own APIs around the GSA to make this possible. An easy way to extract data based on filtering is essential.

What I would like to see in the GSA is easier integration when performing searches, such as a REST or SOAP service that makes it easy to create search clients. This would make it easier to integrate functionality, such as security, externally. Basically you tell the client who the current user is and the client handles the rest. It would also improve maintainability, in the sense that new and changed functionality would not require a new implementation of how to parse the XML response.

I would also like to see a bigger focus on documentation of how to use functionality such as previews and translation externally.

Final words

My feeling is that the GSA is getting stronger, and I like the new features in GSA 7.0. Google has made it clear that they are continuously aiming to improve their product, and I am looking forward to future releases. I hope the GSA will take a step closer to the search-as-a-service concept; the addition of a processing framework would enhance it even further. The future will tell.

Mobile clients and Enterprise Search – What are the Implications?

As we all know, the smartphone user base is growing explosively. According to www.statcounter.com, internet access from handheld mobile devices has doubled yearly since 2009, adding up to 8.5% of all page views globally in January 2012. Mobile users want to be able to do all the same things that they can do on their PC, and that includes access to the company’s Enterprise Search solution!

The benefits of the sales force being able to search for vital customer information before a meeting or for field service personnel being able to find documentation quickly are quite obvious. So how can an organization tweak its search solution in order to provide convenient access for the mobile users? And above all, what will it cost?

Well, to answer the last question first: much less than you think. Providing for the mobile user is mainly about creating a new front end/UI. The main bulk of your search solution remains the same; indexing, metadata structure and content publishing, for instance, remain essentially unaffected.

But you do need to provide a quite different UI for the user interaction to work smoothly, given the specific characteristics of the mobile client, primarily screen size/resolution and text input. The smartphone also has a lot of features that the PC lacks: it is always available, it knows exactly where you are, and it always has a camera, microphone, speaker, possibly a magnetometer and accelerometer, and of course a touchscreen with gestures like pinching and swiping. Many of these features can be quite useful, as the following examples show:

Illustration 1. Google Mobile Voice Search on the iPhone. Courtesy of UX Matters, www.uxmatters.com

  • Google Mobile App for iPhone: in this app, the iPhone senses when the phone is lifted towards the ear and hence knows when to listen for a search command. Since the phone also knows where the user is, a search for “restaurant” automatically generates hits with restaurants in your vicinity.
  • Scanning a Barcode or QR-code: scanning a Barcode or QR-code with your phone is another way of entering a search string. An example could be a product in a store where the customer could open a price-search-engine and scan the QR-code of the product and see where the best price is.

As you can see, there are plenty of opportunities for those who want to be creative. But for the most part, the I/O will still be done via the screen. At UX Matters there is a great article by Greg Nudelman describing the considerations when implementing search for mobile clients and suggestions for various design patterns that can be efficient (see http://www.uxmatters.com/mt/archives/2010/04/design-patterns-for-mobile-faceted-search-part-i.php). I have included a brief summary below together with illustrations courtesy of UX Matters. But first, some general considerations for mobile clients:

  • Use JavaScript code to detect what type of device is accessing your search solution, and display the mobile interface if it is a mobile client.
  • Native App or Mobile Web App: Creating a Mobile Web App is easier and cheaper than creating a native App – for one thing you don’t have to create multiple versions for different OSs (although you still need to test your solution with different browsers/resolutions). Performance-wise there isn’t a big difference between Native Apps and Web Apps, and mobile browsers are increasingly gaining access to most of the phone’s hardware as well.
  • Authentication: SSO for mobile web applications works the same as for desktop browsers.  There are also new solutions currently being launched enabling usage of the company’s existing Active Directory infrastructure. One example is Centrify Directcontrol for Mobile enabling a centralized administration within Active Directory of all device security settings, profiles, certificates and restrictions.
  • Use HTML5 instead of Flash: iPhones don’t support Flash, but HTML5 is a very capable alternative.
  • Testing: How the design looks at different resolutions can be tested through various emulators, but it is always advisable to test on a limited set of real smartphones as well.
  • Access needs to be quick and simple: user interaction is more cumbersome on a phone than on a PC. Normally try to avoid solutions that require more than 3 input actions.
  • Menu navigation: links on the right side are normally used to drill down in the menu hierarchy, and links on the left side to move up/towards the home screen.
  • Gestures: a very powerful toolbox that can be used in many different ways to create an efficient UI. For example, use “pinch to show more” if you want to expand the summary information of a specific item in the search hit list or “swipe” to expose the metadata (or whatever action you want to assign to that gesture).
  • Be creative: the mobile client is inherently different from a PC, limited in some ways but more powerful in others. So if you just try to adopt design solutions from the PC and fit them into a mobile UI you are missing out on a lot of powerful design solutions that only make sense on a mobile client and you are definitely not giving the users the best possible search experience. Also, since mobile design is still evolving you don’t need to be limited by conventions and expectations as much as on the PC side – make the most of this freedom to be creative!
  • W3C mobile: for more information about mobile web development, see http://www.w3.org/Mobile/ which also includes a validating scheme to assess the readiness of content for the mobile web

Design patterns for mobile UI (with courtesy of Greg Nudelman/UX Matters)

Mobile faceting can be tricky but by using design patterns like “4 Corners”, “Modal Overlays”, “Watermarks” and “Teaser Design” the UI can become both intuitive and easy to learn as well as providing reasonably powerful functionality. As mentioned, these techniques are summaries from an article written by Greg Nudelman for UX Matters. If you are eager to learn more, feel free to check out Greg’s website and his upcoming workshops focused on mobile design http://www.designcaffeine.com/category/workshops/

4 Corners: instead of stealing scarce real estate by adding faceting options directly on the screen together with the search result, semitransparent buttons are available in each corner enabling the user to bring up a faceting menu by tapping in a corner (see illustration 2).

Modal Overlays: the modal overlay is displayed on top of the original page. The modal overlay works well together with the 4 corners design – tapping a corner opens up the overlay containing faceting functions like filtering and sorting (see illustration 2).

Illustration 2. Four Corners and Modal Overlay patterns. Courtesy of UX Matters, www.uxmatters.com

Watermarks: a great technique for guiding users and showing the possibility of using new functions. The watermarks, possibly animated, show a symbol for the available action, for instance arrows indicating that a swiping gesture could be used (see illustration 3).

Full-Page Refinement Options Pattern: gives the user plenty of refinement options to choose from (see illustration 3).

Illustration 3. Two variations of the Watermark pattern and a Refinement Options pattern. Courtesy of UX Matters, www.uxmatters.com

Teaser Design: show part of the next available content so that the user is aware that there is more content available (see illustration 4).

Illustration 4. Teaser design pattern facilitates the discovery of faceted search filters. Courtesy of UX Matters, www.uxmatters.com

Persistent Status Bar: always maintain a persistent status bar containing the search string together with applied filters in the search result page. This helps the user maintain orientation. Note that all of the illustrations above have a persistent status bar.

Conclusion

Although Best Practices for mobile UI design are still evolving, plenty of progress has already been made and there are several solutions and design patterns to choose from depending on the specific circumstances at hand. So an implementation project need not be rocket science, as long as you learn the right tricks…

Bringing enterprise information to the field, readily available in a mobile handset or tablet, will mobilize your employees. The UI requires rethinking as we have seen. And security needs to be addressed properly to avoid having sensitive data compromised. But other than that, you are ready to go!

Searching for Zebras: Doing More with Less

There is a very controversial and highly cited 2006 British Medical Journal (BMJ) article called “Googling for a diagnosis – use of Google as a diagnostic aid: internet based study” which concludes that, for difficult medical diagnostic cases, it is often useful to use Google Search as a tool for finding a diagnosis. Difficult medical cases are often represented by rare diseases, which are diseases with a very low prevalence.

The authors use 26 diagnostic cases published in the New England Journal of Medicine (NEJM) to compile a short list of symptoms describing each patient case, and use those keywords as queries for Google. The authors, blinded to the correct disease (a rare disease in 85% of the cases), select the most ‘prominent’ diagnosis that fits each case. In 58% of the cases they succeed in finding the correct diagnosis.

Several other articles also point to Google as a tool often used by clinicians when searching for medical diagnoses.

But is that so convenient, is that enough, or can this process be easily improved? Indeed, two major advantages for Google are the clinicians’ familiarity with it, and its fresh and extensive index. But how would a vertical search engine with focused and curated content compare to Google when given the task of finding the correct diagnosis for a difficult case?

Well, take an open-source search engine such as Indri, index around 30,000 freely available medical articles describing rare or genetic diseases, use an off-the-shelf retrieval model, and there you have Zebra. In medicine, the term “zebra” is slang for a surprising diagnosis. In comparison with a search on Google, which often returns results that point to unverified content from blogs or content aggregators, the documents in this vertical search engine are crawled from 10 web resources that contain only rare and genetic disease articles and are mostly maintained by medical professionals or patient organizations.

Evaluated on a set of 56 queries extracted in a similar manner to the one described above, Zebra easily beats Google. Zebra finds the correct diagnosis in the top 20 results in 68% of the cases, while Google succeeds in 32% of them. And this is only the performance of Zebra with the baseline relevance model. Imagine how much more could be done (for example, displaying results as a network of diseases, clustering or even ranking by diseases, or automatic extraction and translation of electronic health record data).
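The measure used in this comparison, the share of queries for which the correct diagnosis shows up among the top 20 results, is easy to compute. Below is a minimal sketch with made-up data; the case and disease identifiers are hypothetical, and matching results to the correct diagnosis by exact identifier is an assumption made for the example, not a description of how the study judged its results.

```python
def success_at_k(ranked_results, correct_diagnoses, k=20):
    """Fraction of queries whose correct diagnosis appears among the top-k results."""
    hits = 0
    for query_id, results in ranked_results.items():
        if correct_diagnoses[query_id] in results[:k]:
            hits += 1
    return hits / len(ranked_results)


# Toy data: ranked result lists per patient case (identifiers are invented).
ranked = {
    "case1": ["disease_a", "disease_b", "disease_c"],
    "case2": ["disease_d", "disease_e"],
    "case3": ["disease_f", "disease_g", "disease_h"],
    "case4": ["disease_i"],
}
truth = {
    "case1": "disease_b",
    "case2": "disease_x",
    "case3": "disease_h",
    "case4": "disease_i",
}

score_top2 = success_at_k(ranked, truth, k=2)
score_top20 = success_at_k(ranked, truth, k=20)
print(score_top2, score_top20)
```

Running the same function over the result lists of two engines for the same set of queries gives directly comparable numbers of the kind quoted above.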

Enterprise Search Stuffed up with GIS

When I browsed through marketing brochures from GIS (Geographic Information System) vendors, I noticed that the message is quite similar to that of search analytics. In general it refers to the integration of various separate sources into analysis based on geo-visualizations. I have recently seen a quite nice and powerful combination of enterprise search and GIS technologies, so I would like to describe it a little. Let us start with the basics.

Search result visualization

Using a map instead of a simple list of results is an obvious way to visualize what was returned for an entered query. This technique is frequently used in online search applications, especially in directory services like yellow pages or real estate web sites. The list of things required to do this is pretty short:

– geolocalization of items – assigning accurate geo-coordinates to location names, addresses, zip codes or whatever is expected to be shown on the map; geolocalization services are offered more or less for free by Google or Bing Maps

– background map – this is a necessity and is also provided by Google or Bing; there are also plenty of vendors for more specialized mapping applications

– returned results with geo-coordinates as metadata – to put them on the map

Normally this kind of basic GIS visualisation delivers basic map operations like zooming, panning, different views and additionally some more data like traffic, parks, shops etc. Results are usually pins [Bing] or drops [Google].

Querying / filtering with the map

A further step in integrating search and GIS is to use the map as a tool for defining the search query. One way is to define an area of interest drawn on the map as a circle, rectangle or polygon. In the simplest case, the area of the query could be just the current window view of the map. With this approach, the full-text query is refined to include only results that belong to the defined area.

Apart from the map, all other query refinement tools should be available as well, such as date-time sliders or any kind of navigation and fielded queries.
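As an illustration of the simplest case, using the current map view as the area of the query, here is a small sketch. It assumes each result carries `lat` and `lon` metadata fields (hypothetical names), and it filters an in-memory list after the fact; a real implementation would normally pass the bounding box to the search engine as a filter. It also ignores views that cross the 180° meridian.

```python
def in_bounding_box(lat, lon, south, west, north, east):
    """True if the point lies inside the rectangle given by its edges in degrees."""
    return south <= lat <= north and west <= lon <= east


def refine_by_map_view(results, south, west, north, east):
    """Keep only the results whose coordinates fall inside the current map view."""
    return [
        r for r in results
        if in_bounding_box(r["lat"], r["lon"], south, west, north, east)
    ]


# Hypothetical full-text results with geo-coordinates as metadata.
RESULTS = [
    {"title": "Office Gothenburg", "lat": 57.7089, "lon": 11.9746},
    {"title": "Office Stockholm", "lat": 59.3293, "lon": 18.0686},
    {"title": "Office Malmö", "lat": 55.6050, "lon": 13.0038},
]

visible = refine_by_map_view(RESULTS, south=57.0, west=11.0, north=60.0, east=19.0)
print([r["title"] for r in visible])
```

A circle or polygon drawn by the user would replace `in_bounding_box` with the corresponding containment test, while the rest of the refinement stays the same.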

Simple geo-spatial analysis

Sometimes it is important to sort query results by distance from a reference point, for example to see the nearest Chinese restaurants in the neighborhood. I would also categorize as simple geo-spatial analysis the grouping of search results into GIS layers, e.g. density heatmaps or hot spots, using geographical and other information stored in the results’ metadata.
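Sorting by distance can be sketched with the haversine formula, which gives the great-circle distance between two points on a sphere. The restaurant names and coordinates below are invented for the example, and the `lat`/`lon` field names are assumptions; many search engines offer distance sorting natively, in which case that is the better choice.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for ranking nearby results


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sort_by_distance(results, ref_lat, ref_lon):
    """Return the results ordered from nearest to farthest from the reference point."""
    return sorted(
        results,
        key=lambda r: haversine_km(ref_lat, ref_lon, r["lat"], r["lon"]),
    )


# Hypothetical search results for "chinese restaurant".
RESTAURANTS = [
    {"name": "Restaurant A", "lat": 57.7000, "lon": 11.9700},
    {"name": "Restaurant B", "lat": 57.7500, "lon": 12.1000},
    {"name": "Restaurant C", "lat": 57.7100, "lon": 11.9750},
]

nearest_first = sort_by_distance(RESTAURANTS, ref_lat=57.7089, ref_lon=11.9746)
print([r["name"] for r in nearest_first])
```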

Advanced geo-spatial analysis

More advanced query definition and refinement would involve geo-spatial computations. Based on real needs, it could be possible, for example, to refine search results by the line-of-sight area from a chosen reference point, or to select filtering areas such as those inside the borders of specific cities, districts, countries etc.

So the idea is to use relevant output from advanced GIS analysis as an input for query refinement. In this way all the power of GIS can be used to get to the unstructured data through a search process.

What kind of applications do you think could take advantage of search stuffed with really advanced GIS? Looking forward to your comments on this post.

Search Driven Navigation and Content

In the beginning of October I attended Microsoft SharePoint Conference 2011 in Anaheim, USA. There were a lot of interesting and useful topics that were discussed. One really interesting session was Content Targeting with the FAST Search Web Part by Martin Harwar.

Martin Harwar talked about how search can be used to show content on a web page. The most common search-driven content is of course the traditional search. But there is a lot more content that can be retrieved by search. One example is search-driven navigation and content. Search-driven navigation means that instead of having static links on a page, we can render them depending on the query the user typed in. If, for example, a user on a health care site has recently searched for “ear infection”, the page can show links to ear specialist departments. When the user does another search and returns to the same page, the links will be different.

In the same way we can render content on the page. Imagine the webpage of a tools business that has two lists of products on its start page, most popular and newest tools. To make these lists better adapted to a user, we only want to show products that are of interest to that user. Instead of only showing the most popular and newest tools, the lists can also be filtered on the last query the user has typed. Assume a user searches for “saw” and then returns to the page with the product lists. The lists will now show the most popular saws and the newest saws. This can also be used when a user finds the company’s webpage by searching for “saw” on, for instance, Google.
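The tools example can be sketched in a few lines. This is only an illustration of the idea, not how the FAST Search Web Part does it: the product catalogue, the field names and the simple substring match on the product name are all assumptions, and in a real solution both lists would be search queries with the user's last query added as a filter.

```python
# Hypothetical product catalogue; `released` uses ISO dates so strings sort chronologically.
PRODUCTS = [
    {"name": "Hand saw", "sales": 120, "released": "2011-03-01"},
    {"name": "Circular saw", "sales": 300, "released": "2011-09-15"},
    {"name": "Jigsaw", "sales": 80, "released": "2010-05-20"},
    {"name": "Cordless drill", "sales": 500, "released": "2011-10-01"},
    {"name": "Claw hammer", "sales": 250, "released": "2009-01-10"},
]


def personalised_lists(products, last_query=None, size=2):
    """Return (most_popular, newest) product names, filtered on the user's last query if any."""
    pool = products
    if last_query:
        query = last_query.lower()
        matching = [p for p in products if query in p["name"].lower()]
        # Fall back to the unfiltered lists when the query matches nothing.
        pool = matching or products
    most_popular = sorted(pool, key=lambda p: p["sales"], reverse=True)[:size]
    newest = sorted(pool, key=lambda p: p["released"], reverse=True)[:size]
    return [p["name"] for p in most_popular], [p["name"] for p in newest]


default_popular, default_newest = personalised_lists(PRODUCTS)
saw_popular, saw_newest = personalised_lists(PRODUCTS, last_query="saw")
print(default_popular, saw_popular)
```

The same function covers the case of a visitor arriving from an external search engine, provided the query they used there is available to the page.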

This shows that search can be used in many ways to personalize a webpage and thereby increase Findability.

Book Review: Search Analytics for Your Site

Lou Rosenfeld is the founder and publisher of Rosenfeld Media and also the co-author (with Peter Morville) of the best-selling book Information Architecture for the World Wide Web, which is considered one of the best books about information management.

In Lou Rosenfeld’s latest book he lets us know how to successfully work with Site Search Analytics (SSA). With SSA you analyse the saved search logs of what your users are searching for to try to find emerging patterns. This information can be a great help to figure out what users want and need from your site.  The search terms used on your site will offer more clues to why the user is on your site compared to search queries from Google (which reveal how they get to your site).
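To give a feeling for what a first pass over a search log can look like, here is a minimal sketch that counts the most frequent queries and the most frequent queries with zero hits. The log format, a list of (query, number of hits) pairs, and the sample entries are assumptions made for the example; real logs look different from one search engine to the next, and this is not a method taken from the book.

```python
from collections import Counter

# Hypothetical search log: one (query, number_of_hits) pair per search.
SEARCH_LOG = [
    ("vacation policy", 12),
    ("Vacation  Policy", 12),
    ("expense report", 5),
    ("expnse report", 0),
    ("parking", 0),
    ("parking", 0),
]


def analyse_log(log, top_n=3):
    """Return (most common queries, most common zero-hit queries) as (query, count) lists."""
    all_queries = Counter()
    zero_hit_queries = Counter()
    for query, hits in log:
        # Normalise case and whitespace so trivial variants are counted together.
        normalised = " ".join(query.lower().split())
        all_queries[normalised] += 1
        if hits == 0:
            zero_hit_queries[normalised] += 1
    return all_queries.most_common(top_n), zero_hit_queries.most_common(top_n)


top_queries, top_zero_hits = analyse_log(SEARCH_LOG)
print(top_queries)
print(top_zero_hits)
```

Even a toy log like this one shows the two typical findings: a misspelling that returns nothing, and a topic people look for that the site has no content about.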

So what’s in the book?

Part I – Introducing Site Search Analytics

In part one the reader gets a great example of why to use SSA and an introduction to what SSA is. In the first chapters you follow John Ferrara who worked at a company called Vanguard and how he analysed search logs to prove that a newly bought search engine performed poorly whilst using the same statistics to improve it. This is a great real world example of how to use SSA for measuring quality of search AND to set up goals for improvement.

a word cloud is one way to play with the data

Part II – Analysing the data

In this part Lou gets hands-on with user logs and shows you how to analyse the data. He makes it fun and emphasizes the need to play with user data. Without the emphasis on playing, the task of analysing user data may seem daunting. Also, with real-world examples from different companies and institutions it is easy to understand the different methods for analysis. Personally, I feel the use of real data in the book makes the subject easier (and more interesting) to understand.

From which pages do users search?

Part III – Improving your site

In the third part of the book, Rosenfeld shows how to apply the findings from your analysis. If you’ve worked with SSA before, most of it will be familiar (improving best bets, zero hits, query completion and synonyms), but even for experienced professionals there is good information about how to improve everything from site navigation to site content, and even how to connect your SSA to your site KPIs.

Conclusion

Search Analytics for Your Site shows how easy it is to get started with SSA, but also its depth and usefulness. The book is easy to read and also quite funny. It is quite short, which in this day and age isn’t a negative. This book reminded me of the importance of search analytics, and I really hope more companies and sites take its lessons to heart and focus on search analytics.

Google Search Appliance (GSA) 6.12 released

Google has released yet another version of the Google Search Appliance (GSA). It is good to see that Google stays active when it comes to improving their enterprise search product! Below is a list of the new features:

Dynamic navigation for secure search

The facet feature, new since 6.8, is still being improved. When filters are created, it is now possible to take into account that they only include secure documents which the user is authorized to see.

Nested metadata queries

In previous Search Appliance releases there were restrictions for nesting meta tags in search queries. In this release many of those restrictions are lifted.

LDAP authentication with Universal Login

You can configure a Universal Login credential group for LDAP authentication.

Index removal and backoff intervals

When the Search Appliance encounters a temporary error while trying to fetch a document during crawl, it retains the document in the crawl queue and index. It schedules a series of retries after certain time intervals, known as “backoff” intervals, before removing the URL from the index.

An example of when this is useful is the processing pipeline that we have implemented for the GSA. The GSA uses an external component to index the content. If that component goes down, the GSA will receive a “404 – page does not exist” when trying to crawl, and this may cause mass removal from the index. With this functionality turned on, that can be avoided.
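The general idea behind backoff intervals can be sketched as follows. This is a conceptual illustration only: the interval values and the function are hypothetical and do not reflect the appliance's actual defaults or internal logic.

```python
# Hypothetical backoff schedule in hours; NOT the appliance's real default values.
BACKOFF_INTERVALS_HOURS = [1, 6, 24]


def next_action(failed_attempts, intervals=BACKOFF_INTERVALS_HOURS):
    """Decide what to do after `failed_attempts` consecutive fetch errors for one URL.

    Returns ("retry", hours_to_wait) while backoff intervals remain,
    and ("remove", None) once all retries have been used up.
    """
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts <= len(intervals):
        return ("retry", intervals[failed_attempts - 1])
    return ("remove", None)


for attempt in range(1, 5):
    print(attempt, next_action(attempt))
```

With a schedule like this, an external component that is down for a short while only causes a few delayed retries, and the document stays in the index in the meantime.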

Specify URLs to crawl immediately in feeds

Release 6.12 provides the ability to specify URLs to crawl immediately in a feed by using the crawl-immediately attribute. This is a nice feature in order to prioritise what needs to get indexed quickly.
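As a rough sketch, a feed that uses the attribute could be built like this. Only the crawl-immediately attribute comes from the release notes above; the surrounding layout (gsafeed, header, datasource, feedtype, group, record) is assumed to follow the usual feed XML structure, and the datasource name and URL are made up, so check the result against the feeds documentation for your appliance version before using it.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, urls, crawl_immediately=True):
    """Build feed XML with one record per URL (layout assumed, see the text above)."""
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "metadata-and-url"
    group = ET.SubElement(feed, "group")
    for url in urls:
        attributes = {"url": url, "mimetype": "text/html"}
        if crawl_immediately:
            # Ask the appliance to prioritise this URL in the crawl queue.
            attributes["crawl-immediately"] = "true"
        ET.SubElement(group, "record", attributes)
    return ET.tostring(feed, encoding="unicode")


# Hypothetical datasource name and URL, for illustration only.
feed_xml = build_feed("intranet_news", ["http://intranet.example.com/news/today.html"])
print(feed_xml)
```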

X-robots-tag support

The appliance now supports the X-Robots-Tag HTTP header. This makes it possible to exclude non-HTML documents, which cannot carry a robots meta tag.

Google Search Appliance documentation page

Design Principles for Enterprise Search – The Philosophy of UX

In May I attended An Event Apart in Boston (AEA). AEA is a 2-day (design) conference for people who work with websites, created by the father of web design, Jeffrey Zeldman, and the CSS guru Eric Meyer. The conference has a broad perspective, dealing with everything from how to write CSS3 and HTML5 to content strategy and graphic design. This post is about an AEA topic brought up by Whitney Hess: Create design principles and use them to establish a philosophy for the user experience.

Hess wants to create universal principles for user experience, to communicate a shared understanding amongst team members and customers and to create a basis for objective evaluation. The principles suggested by Hess are listed below, along with examples of how they can relate to search and search user interfaces.

Stay out of people’s way

When you do know what people want, stay out of their way

Google knows what to do when people visit their search at Google.com. They get out of the way and make it easy to get things done. The point is not to disturb users with things they do not need, from modal popup windows to too many settings.

Create a hierarchy that matches people’s needs

Give crucial elements the greatest prominence

This means that the most used information should be easy to find and use. A classic example is that on most university webpages it is almost impossible to find contact details for faculty members or the campus address, but very easy to find a statement of the school philosophy. Yet the former is probably what most users will try to find.

university website -  xkcd.com/773/

Limit distractions

This principle means that you should design for consecutive tasks and limit related information to the information you know would help the user with her current task. Don’t include related information in a search user interface just because you can if the information does not add value.

Provide strong information scent

There should be enough information in search results for users to decide if results are relevant. In an e-commerce site this would be the difference between selling and not selling. A search result will not be perceived as cluttered if the correct data is shown.

Provide signposts and cues

Always make it clear how to start a new search, how to apply filters and what kind of actions can be applied to specific search results.

Provide context

Let the user know that there are different kinds of search results. Display thumbnails for pictures and videos, or show MSN availability in people search.

Use constraints appropriately

Prevent errors before they happen. Query suggestion is a good way to do this, as it helps users avoid spelling errors before they happen. This saves time and frustration for the user.
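A very small sketch of query suggestion could look like this. It assumes a list of known good queries (for instance taken from the search logs; the ones below are invented), offers prefix matches while the user types, and falls back to approximate matching with Python's standard difflib module when nothing starts with the typed text. Production query suggestion is usually served by the search engine itself, so treat this as an illustration of the principle only.

```python
import difflib

# Hypothetical list of known queries, e.g. frequent queries from the search logs.
KNOWN_QUERIES = ["vacation policy", "expense report", "parking permit", "travel booking"]


def suggest(typed, known=KNOWN_QUERIES, limit=3):
    """Suggest known queries for what the user has typed so far."""
    text = " ".join(typed.lower().split())
    if not text:
        return []
    prefix_matches = [q for q in known if q.startswith(text)]
    if prefix_matches:
        return prefix_matches[:limit]
    # No known query starts with the typed text: fall back to approximate matches.
    return difflib.get_close_matches(text, known, n=limit, cutoff=0.6)


print(suggest("exp"))
print(suggest("expnse report"))
```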

Make actions reversible

Make it obvious how to remove filters or reset other settings.

Provide feedback

Interaction is a conversation so let the user know when something happens or when the search interface fetches new search results. Never let the user guess what happens.

Make a good first impression

You only have one chance to make a first impression. It is therefore important to spend time designing the first impression of any interface. Always aim to make the experience better for new users. This could mean voluntary tutorials or fun and good-looking welcome messages.

So now what?

Are universal principles enough? Probably not. Every project and company is different and needs its own principles to identify with. Hess ended her presentation with tips on how to create company principles to complement the universal principles. Maybe there will be future blog posts about creating your own design principles.

So what are your company’s principles?