Findwise is proud to announce that we now have released our first publicly available plugins to the Open Pipeline crawling and document processing framework. A list of all available plugins can be found on the Open Pipeline Plugins page and the ones Findwise have created can be downloaded on our Findwise Open Pipeline Plugins page.
OpenPipeline is an open source software for crawling, parsing, analyzing and routing documents. It ties together otherwise incomplete solutions for enterprise search and document processing. OpenPipeline provides a common architecture for connectors to data sources, file filters, text analyzers and modules to distribute documents across a network. It includes a job scheduler and a full UI with a point-and-click interface.
Findwise have been using this framework in a number of customer projects with great success. It ties particularly good together with Apache Solr, not only because it is open source but most importantly because it fills a hole in functionality that Solr lacks – an easy to use framework for developing document processors and connectors. However we are not using this for Solr only, a number of plugins for the Google Search Appliance have also been made and we have started investigating how Open Pipeline can be integrated with the IBM Omnifind search engine as well.
The best thing with this framework is that it is very flexible and customizable but still easy to use AND, maybe most importantly for me as a developer, easy to work with and develop against. It has a simple yet powerful enough API to handle all that you need. And because it is an open source framework any shortcomings and limitations that we find along the way can be investigated in detail and a better solution can be proposed to the Open Pipeline team for inclusion in future releases.
We have in fact already contributed to the development of the project in a great deal by using it, testing it and by reporting bugs and suggested improvements on their forums. And the response from the team has been very good – some of our suggested improvements have already been included and some are on the way in the new 0.8 version. We are also in the process of further deepening the collaboration by signing a contributors agreement so that we eventually can be able to contribute with code as well.
So how do our customers benefit from this?
First it makes us develop and deliver search and index solutions more quickly and of better quality to our customers. This is because more developers can work with the same framework as a base and the overall code base will be used more, tested more and is thus of better quality. We have also the possibility to reuse good and well tested components so that several customers together can share the costs of development and thus get a better service/product for less money which is always a good thing of course!