Google Search Appliance (GSA) 6.12 released

Google has released yet another version of the Google Search Appliance (GSA). It is good to see that Google stays active when it comes to improving its enterprise search product! Below is a list of the new features:

Dynamic navigation for secure search

The facet feature, new since 6.8, is still being improved. When filters are created, the appliance can now take authorization into account, so that they only include secure documents that the user is authorized to see.

Nested metadata queries

In previous Search Appliance releases, there were restrictions on nesting meta tags in search queries. In this release, many of those restrictions are lifted.

LDAP authentication with Universal Login

You can configure a Universal Login credential group for LDAP authentication.

Index removal and backoff intervals

When the Search Appliance encounters a temporary error while trying to fetch a document during crawl, it retains the document in the crawl queue and index. It schedules a series of retries at certain time intervals, known as “backoff” intervals, before removing the URL from the index.

One example of when this is useful is the processing pipeline that we have implemented for the GSA. The GSA uses an external component to index the content. If that component goes down, the GSA receives a “404 – page does not exist” when trying to crawl, which may cause mass removal from the index. With this functionality turned on, that can be avoided.
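The retry-then-remove behaviour can be sketched roughly as follows. This is only an illustration of the idea, not the appliance's actual logic: the interval values, the `CrawlEntry` class and the `record_fetch` function are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical backoff intervals in hours. The real values are
# configured on the appliance and are not taken from its documentation.
BACKOFF_HOURS = [1, 6, 24, 72]


@dataclass
class CrawlEntry:
    url: str
    failures: int = 0
    in_index: bool = True


def record_fetch(entry, ok, backoff=BACKOFF_HOURS):
    """Update the entry after a fetch attempt.

    Returns the number of hours to wait before the next retry,
    or None when no retry is scheduled (success or removal).
    """
    if ok:
        # A successful fetch resets the failure counter.
        entry.failures = 0
        return None
    entry.failures += 1
    if entry.failures > len(backoff):
        # All backoff intervals are used up: only now is the URL removed.
        entry.in_index = False
        return None
    # The document stays in the index while retries are pending.
    return backoff[entry.failures - 1]
```

With four intervals, a document survives four failed fetches and is removed on the fifth, which gives a crashed external component time to come back up.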

Specify URLs to crawl immediately in feeds

Release 6.12 provides the ability to specify URLs to crawl immediately in a feed by using the crawl-immediately attribute. This is a nice feature for prioritising what needs to get indexed quickly.
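As a rough illustration, a feed that flags some URLs for immediate crawling could be generated like this. The `build_feed` helper, the datasource name and the URLs are made up for the example, and the element layout follows the general shape of a GSA feed as I understand it, so check it against the Feeds Protocol documentation before use. The XML declaration and DOCTYPE that a real feed file needs are left out here.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, urls, urgent):
    """Build a feed document; URLs in `urgent` get crawl-immediately="true".

    Hypothetical helper for illustration only.
    """
    gsafeed = ET.Element("gsafeed")
    header = ET.SubElement(gsafeed, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "metadata-and-url"
    group = ET.SubElement(gsafeed, "group")
    for url in urls:
        attrs = {"url": url, "mimetype": "text/html"}
        if url in urgent:
            # The attribute introduced in release 6.12.
            attrs["crawl-immediately"] = "true"
        ET.SubElement(group, "record", attrs)
    return ET.tostring(gsafeed, encoding="unicode")


feed_xml = build_feed(
    "intranet",
    ["http://intranet.example.com/news", "http://intranet.example.com/archive"],
    urgent={"http://intranet.example.com/news"},
)
```

Only the news page is flagged here, so the archive page is crawled according to the normal schedule.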

X-robots-tag support

The Appliance now supports the X-Robots-Tag HTTP header. This makes it possible to exclude non-HTML documents, such as PDF files, which cannot carry a robots meta tag in their markup.
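On the content server side, this means sending the header along with the document. Below is a minimal sketch using Python's standard library; the list of excluded file suffixes and the `robots_headers` helper are assumptions made for the example, and a real site would normally configure this in its web server instead.

```python
from http.server import BaseHTTPRequestHandler
from urllib.parse import urlsplit

# Hypothetical choice of file types to keep out of the index.
EXCLUDED_SUFFIXES = (".pdf", ".doc")


def robots_headers(path):
    """Return the extra response headers for a requested path."""
    # Ignore any query string when looking at the file suffix.
    clean_path = urlsplit(path).path.lower()
    if clean_path.endswith(EXCLUDED_SUFFIXES):
        return {"X-Robots-Tag": "noindex"}
    return {}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        for name, value in robots_headers(self.path).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(b"placeholder body")

# To try it locally:
#   from http.server import HTTPServer
#   HTTPServer(("localhost", 8000), Handler).serve_forever()
```

A crawler that honours the header will then skip `/reports/q3.pdf` but still index `/index.html`.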

Google Search Appliance documentation page

Google Search Appliance (GSA) 6.10 released

Last week, Google released version 6.10 of the software to their Google Search Appliance (GSA).

This is a minor update, and the focus of the Google teams has been on bug fixes and increased stability. Looking at the release notes, there are indeed plenty of bugs that have been fixed.

However, there are also some new features in this release. Some of the more interesting, in my opinion, are:

Multiple front-end configuration for Dynamic Navigation

Since the 6.8 release, the GSA has been able to provide facets, or Dynamic Navigation as Google calls it. However, the facets have been global, so you couldn’t have two front ends with different facets. This is now possible.

More feeds statistics and Adjust PageRank in feeds

More statistics on what’s happening with the feeds you push into the GSA is a very welcome feature. The possibility to adjust PageRank allows for some more control over relevancy in feeds.

Indexing: Crawl-time Kerberos support and indexing large files

Google is working hard on security, and every release since 6.0 has included some security improvements. Nice to see that it continues. Since the beginning, the GSA has simply dropped files bigger than 30 MB. Now it will index larger files (you can configure how large), but still only the first 2.5 MB of the content will be indexed.

Stopword lists for different languages

Scalability: Centralized configuration

For a multi-node GSA setup, you can now specify the configuration on the master, and it is propagated to the slaves.

For a complete list of new features, see the New and Changed Features page in the documentation.

Google Search Appliance Learns What You Want to Find

Analyzing user behaviour is a key ingredient in making a search solution successful. By using Search Analytics, you gain knowledge of how your users use the search solution and what they expect to find. With this knowledge, simple adjustments such as Key Matches, Synonyms and Query Suggestion can enhance the findability of your search solution.

In addition to this, you can also tune the relevancy by knowing what your users are after. An exciting field in this area is automating this task, i.e. by analyzing what users click on in the search result, the relevancy of the documents is automatically adjusted. Findwise has been looking into this area lately, but there hasn’t been any out-of-the-box functionality for this from any vendor.

Until now.

Two weeks ago Google announced the second major upgrade this year for the Google Search Appliance. Labeled as version 6.2, it brings a lot of new features. The most interesting and innovative one is the Self-Learning Scorer. The Self-Learning Scorer analyzes users’ clicks and behaviour in the search result and uses them as input to adjust the relevancy. This means that if a lot of people click on the third result, the GSA will boost this document to appear higher up in the result set. So, without you having to do anything, the relevance will increase over time, making your search solution perform better the more it is used. It’s easy to imagine this will create an upward spiral.
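To make the idea concrete, here is a toy version of click-based boosting. It is not Google's algorithm, which is not described in the announcement; the `rerank` function, the `weight` parameter and the scoring formula are invented purely to illustrate how click data can move a frequently clicked document up the list.

```python
def rerank(results, clicks, weight=0.5):
    """Re-order results using click counts.

    results: list of (doc_id, base_score) tuples, best first.
    clicks:  dict mapping doc_id to the number of clicks it received.
    weight:  hypothetical tuning knob for how much clicks matter.
    """
    total = sum(clicks.values())

    def score(item):
        doc_id, base_score = item
        # Share of all clicks that went to this document.
        share = clicks.get(doc_id, 0) / total if total else 0.0
        return base_score * (1 + weight * share)

    # sorted() is stable, so ties keep their original order.
    return sorted(results, key=score, reverse=True)


results = [("a", 1.0), ("b", 0.9), ("c", 0.8)]
clicks = {"a": 5, "b": 5, "c": 90}
reranked = rerank(results, clicks)
```

In this example the third result receives 90 of the 100 clicks, so its boosted score (1.16) passes the scores of the two documents originally ranked above it, and it moves to the top.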

The 6.2 release also delivers improvements regarding security, connectivity, indexing overview and more. To read more about the release, head over to the Google Enterprise Blog.

Try the GSA Virtual Edition

One drawback with the Google Search Appliance (GSA) has been that you cannot test it before you buy it. You could go to a Google Partner and ask them to index your content, but that only works well with public content. If the content is behind your firewall, it gets harder, and you most probably have to buy your own GSA just to try it out.

With the Virtual Google Search Appliance (VGSA) you can now try all the GSA functionality before buying it. The VGSA is simply a VMware image of the GSA software. Install it on a regular server, fire up VMware and you’re good to go! All functionality of the real deal is available, including the connector framework. The only limitation is the index limit of 50,000 documents.

Our customers often want to do a PoC or pilot before investing in an enterprise search solution. The VGSA is ideal for this since it’s easy and cheap to get up and running. It’s also great for us partners, since we can now have multiple installations to experiment with without buying a lot of hardware.

Read more about the VGSA here!