ExternalFileField in Solr

Sometimes we want to update the values of one field more often than the rest of a document's fields. A good solution is the field type ExternalFileField, which gets its values from an external file instead of the index. Such a file can easily be changed, and the new values take effect after a commit, so no documents need to be re-indexed. A field of type ExternalFileField is not searchable; it may currently only be used as a ValueSource in a FunctionQuery.

The external file contains keys and values, one key=value pair per line.

The keys don’t need to be unique.

The name of the external file must be external_<fieldname> or external_<fieldname>.* and must be placed in the index directory.
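As a rough illustration of the file described above, the following Python sketch writes such a file. It assumes the one key=value pair per line layout; the function name write_external_file, the field name popularity and the page keys are hypothetical and only serve the example.

```python
import os
import tempfile


def write_external_file(data_dir, field_name, values):
    """Write key=value lines to external_<field_name> in data_dir."""
    target = os.path.join(data_dir, "external_" + field_name)
    # Write to a temporary file first, then swap it in, so a reader
    # never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=data_dir, prefix=".tmp_external_")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Sorting by key is optional here; it just keeps the file tidy.
        for key in sorted(values):
            handle.write("{}={}\n".format(key, float(values[key])))
    os.replace(tmp_path, target)
    return target


if __name__ == "__main__":
    # Hypothetical data: visit counts keyed by the value of the key field.
    demo_dir = tempfile.mkdtemp()
    path = write_external_file(demo_dir, "popularity", {"page-2": 3, "page-1": 120})
    with open(path, encoding="utf-8") as handle:
        print(handle.read())
```

In a real setup data_dir would be the index directory mentioned above, and the new values would be picked up after the next commit.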

A new field type of type ExternalFileField, and a field that uses it, must be added to schema.xml.

<fieldType name="file" class="solr.ExternalFileField"
           keyField="keyField" defVal="1" indexed="false"
           stored="false" valType="float" />

<field name="<fieldname>" type="file" />

keyField is the field that contains the keys and <fieldname> contains the values from the external file.

valType defines the value type of the field.

At Findwise we have used this method for a customer where we wanted to show the most visited pages higher up in the search result. These statistics change daily for a lot of pages, and we don’t want to re-index all of them every day.
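One way to let such a field influence ranking is Solr's boost query parser, which multiplies the score of a query by the value of a function. The sketch below only builds the request parameters; the field name popularity and the helper build_boosted_query are assumptions made for the example, not the setup used for the customer.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, popularity_field):
    """Return URL-encoded Solr params that multiply the score by a field value."""
    params = {
        # {!boost b=...} wraps the user's query (passed in qq) and
        # multiplies its score by the value of the function given in b.
        "q": "{!boost b=" + popularity_field + " v=$qq}",
        "qq": user_query,
    }
    return urlencode(params)


if __name__ == "__main__":
    # The resulting string would be appended to a select URL of a Solr core.
    print(build_boosted_query("enterprise search", "popularity"))
```

With defVal="1" as in the schema above, documents that have no entry in the external file keep their original score, since multiplying by 1 changes nothing.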

Enterprise Search and Business Intelligence?

Business Intelligence (BI) and Enterprise Search are a never-ending story

A number of years ago Gartner coined “Biggle”, an expression for BI meeting Google. Back then a number of BI vendors, among them Cognos and SAS, claimed that they were working with enterprise search strategically (e.g. by becoming Google One-box partners). Search vendors like FAST, Autonomy and IBM also started to cooperate with companies such as Cognos. “The Adaptive Warehouse” and “BI for the masses” soon became buzzwords that spread in the industry.

The skeptics claimed that enterprise search would never be good with numbers and that BI would never be good with text.

Since then a lot has happened, and today the major vendors within Enterprise Search all claim to have BI solutions that can be fully integrated (and the other way around: BI solutions that can integrate with enterprise search).

The aim is the same now as back then: to provide unified access to both structured (database) and unstructured (content) corporate information. As FAST wrote in a number of ‘Special Focus’:

“Users should have access to a wide variety of data from just one, simple search interface, covering reports, analysis, scorecards, dashboards and other information from the BI side, along with documents, e-mail and other forms of unstructured information.”

And of course, this seems appealing to customers. But does access to all information really make us more likely to make the right decisions in terms of Business Intelligence? Gartner is in doubt.

Nigel Rayner, research vice president at Gartner Inc, says that:

“The problem isn’t that they (users) don’t have access to information or tools; they already have too much information, and that’s just in the structured BI world. Now you want to couple it with unstructured data? That’s a whole load of garbage coming from the outside world.”

But he also states that search can be used as one part of BI:

“Part of the problem with traditional BI is that it’s very focused on structured information. Search can help with getting access to the vast amount of structured information you have”

Looking at the discussions going on in forums, in blogs and in the research domain, most people seem to agree with Gartner’s view: enterprise search and business intelligence make a powerful combination, but the integration needs to be made with a number of things in mind:

Data quality

As mentioned before, if one wants to make unstructured and structured information available as a complement to BI, it needs to be of good quality. Knowing that the information found is the latest copy and written by someone with knowledge of the area is essential. Bad information quality is a threat to an Enterprise Search solution; to a combined BI and search solution it can be devastating. Having Content Lifecycles in place (reviewing, deleting, archiving etc.) is a fundamental prerequisite.

Data analysis

Business Intelligence is traditionally built on preconceived ideas of what data the users need, whereas search gives access to all information in an ad-hoc manner. Combining the two requires a structured way of analyzing the data. If the unstructured information is taken out of its context, there is a risk that decisions are built on assumptions and not facts.

BI for the masses?

The old buzzwords are still alive, but the question mark remains. If one wants to give everyone access to BI data, it has to be clear what the purpose is. Giving people a context, for example by combining the latest sales statistics with searches for information about the ongoing marketing activities, serves a purpose and improves findability. Just making numbers available does not.

Business intelligence and enterprise search in a combined dashboard: vision or reality in the near future?

So, to conclude: Gartner’s vision of “Biggle” is not yet fulfilled. There are a number of interesting opportunities for the business to create findability solutions that combine business intelligence and enterprise search, but the strategies for adopting them need to be developed in order to create the really interesting cases.

Have you come across any successful enterprise search and business intelligence integrations? What is your vision? Do you think the integration between the two is a likely scenario?

Please let us know by posting your comments.

It’s soon time for us to go on summer vacation.

If you are Swedish, Nicklas Lundblad from Google had an interesting program about search (Sommar i P1) the other day, which is available as a podcast.

Have a nice summer all of you!