Google Search Appliance (GSA) 6.12 released

Google has released yet another version of the Google Search Appliance (GSA). It is good to see that Google stays active when it comes to improving its enterprise search product! Below is a list of the new features:

Dynamic navigation for secure search

Dynamic navigation, the facet feature introduced in 6.8, is still being improved. When filters are created, they can now take secure documents into account, so that they only include documents the user is authorized to see.

Nested metadata queries

In previous Search Appliance releases there were restrictions for nesting meta tags in search queries. In this release many of those restrictions are lifted.
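
As a rough illustration, a nested metadata filter could be passed to the appliance in the requiredfields search parameter. The sketch below is only that: it assumes the GSA convention of "." for AND and "|" for OR with parentheses for grouping, and the host name, meta tag names and the helper build_search_url are made up for the example.

```python
from urllib.parse import urlencode


def build_search_url(host, query, requiredfields):
    """Build a GSA search URL with a metadata filter (hypothetical helper)."""
    params = {
        "q": query,
        "site": "default_collection",    # assumed collection name
        "client": "default_frontend",    # assumed front end name
        "output": "xml_no_dtd",
        "requiredfields": requiredfields,
    }
    # urlencode percent-encodes the parentheses, colons and pipe for us.
    return "http://" + host + "/search?" + urlencode(params)


# (department is sales OR marketing) AND year is 2011; tag names are invented.
nested_filter = "(department:sales|department:marketing).year:2011"
url = build_search_url("gsa.example.com", "budget", nested_filter)
print(url)
```

Which combinations are actually accepted depends on the appliance version, so it is worth checking the expression against the search protocol reference before relying on it.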

LDAP authentication with Universal Login

You can configure a Universal Login credential group for LDAP authentication.

Index removal and backoff intervals

When the Search Appliance encounters a temporary error while trying to fetch a document during crawl, it retains the document in the crawl queue and index. It schedules a series of retries after certain time intervals, known as “backoff” intervals, before removing the URL from the index.

One example of where this is useful is the processing pipeline that we have implemented for the GSA. The GSA uses an external component to index the content. If that component goes down, the GSA receives a “404 – page does not exist” when it tries to crawl, which may cause mass removal from the index. With this functionality turned on, that can be avoided.
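
To make the idea concrete, here is a minimal sketch of the retry-then-remove logic. It is not the appliance's implementation: the interval values and the function next_action are assumptions for illustration, and the real intervals are whatever is configured on the GSA.

```python
from datetime import datetime, timedelta

# Hypothetical backoff intervals; the real ones are configured on the appliance.
BACKOFF_INTERVALS = [timedelta(hours=1), timedelta(hours=6), timedelta(days=1)]


def next_action(failed_attempts, last_attempt):
    """Decide what to do with a URL after a failed fetch.

    Returns ("retry", when) while backoff intervals remain,
    otherwise ("remove", None) to drop the URL from the index.
    """
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts <= len(BACKOFF_INTERVALS):
        wait = BACKOFF_INTERVALS[failed_attempts - 1]
        return ("retry", last_attempt + wait)
    return ("remove", None)


now = datetime(2011, 1, 1, 12, 0)
for attempt in range(1, 5):
    print(attempt, next_action(attempt, now))
```

With these made-up values a document survives three failed fetches, and is only removed on the fourth, which gives an external component time to come back up.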

Specify URLs to crawl immediately in feeds

Release 6.12 provides the ability to specify URLs to crawl immediately in a feed by using the crawl-immediately attribute. This is a handy way to prioritise what needs to be indexed quickly.
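
A feed that uses the attribute could be generated along these lines. This is a sketch under assumptions: the element names follow the GSA feed format as I understand it (gsafeed, header, group, record), while the datasource name, the URLs and the helper build_feed are invented for the example.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, urgent_urls, normal_urls=()):
    """Build a metadata-and-url feed; urgent URLs get crawl-immediately."""
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "metadata-and-url"
    group = ET.SubElement(feed, "group")
    for url in urgent_urls:
        # These records ask the appliance to crawl the URL right away.
        ET.SubElement(group, "record", {
            "url": url,
            "mimetype": "text/html",
            "crawl-immediately": "true",
        })
    for url in normal_urls:
        # These records go through the ordinary crawl schedule.
        ET.SubElement(group, "record", {"url": url, "mimetype": "text/html"})
    return ET.tostring(feed, encoding="unicode")


feed_xml = build_feed(
    "example_source",
    urgent_urls=["http://intranet.example.com/news/today.html"],
    normal_urls=["http://intranet.example.com/archive/2010.html"],
)
print(feed_xml)
```

A real feed also needs the XML declaration and DOCTYPE that the feed documentation specifies before it is posted to the appliance.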

X-robots-tag support

The appliance now supports the X-Robots-Tag HTTP header. This makes it possible to exclude non-HTML documents, such as PDFs, which cannot carry a robots meta tag.
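
On the content server side, this comes down to sending the header along with the document. Below is a minimal sketch as a WSGI application; the name pdf_app and the placeholder body are assumptions, and in practice the header is more often set in the web server configuration.

```python
def pdf_app(environ, start_response):
    """Serve a PDF that crawlers are asked not to index."""
    body = b"%PDF-1.4 placeholder content"  # stand-in for a real PDF file
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        # A PDF has no <meta> tag, so the robots directive goes in a header.
        ("X-Robots-Tag", "noindex"),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Try it locally with the reference server from the standard library.
    from wsgiref.simple_server import make_server
    make_server("localhost", 8000, pdf_app).serve_forever()
```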

Google Search Appliance documentation page

The Evolution of Search in Video Media

Search is increasingly an infrastructure necessity and, in some areas and for some users, is considered a commodity. At the same time, new areas of use for search are emerging rapidly, both on the web and within enterprises. Google’s recent acquisition of YouTube is one example. Searching in video material is not simple, and I believe we have only seen the very early stages of this technology.

I am participating in an EU-funded project, RUSHES, which is part of the 6th Framework Programme. Among other things, the project aims to develop techniques for automatic content cataloguing and semantic-based indexing. So what impact will this have on end users and on search in video?

Well, they won’t have to go to a category such as “News and politics” and search within it. Instead, users will be able to use keywords such as “president” and “scandals” to get clips about Nixon and the Watergate saga. The content provider, on the other hand, won’t have to watch the video clip in order to annotate and meta tag it; they will just run the video through a “RUSHES” module and the program will handle the rest. These new scenarios, in combination with the semantic web and Web 2.0, will open up possibilities and business opportunities that we have not even dreamt of before. Like search in video!