In a previous post, we explained the continuous crawl, a new feature in SharePoint 2013 that overcomes limitations of the incremental crawl by narrowing the gap between the time a document is updated and the time the change is visible in search. A different concept in this area is event-driven indexing.
Content pull vs. content push
In the case of event-driven indexing, the index is updated in real time as an item is added or changed. The event of updating the item triggers the actual indexing of that item, i.e. pushes the content to the index. Similarly, deleting an item removes it from the index immediately, so it no longer appears in the search results.
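To make the push model concrete, here is a minimal sketch in Python. It is not the Findwise connector and does not use any SharePoint API: `InMemoryIndex`, `PushConnector` and the `on_item_*` handler names are hypothetical, chosen only to show that each content event updates the index right away.

```python
class InMemoryIndex:
    """Hypothetical stand-in for a search index that accepts pushed updates."""

    def __init__(self):
        self.docs = {}  # item id -> indexed text

    def upsert(self, item_id, text):
        self.docs[item_id] = text

    def delete(self, item_id):
        self.docs.pop(item_id, None)

    def search(self, term):
        term = term.lower()
        return sorted(i for i, t in self.docs.items() if term in t.lower())


class PushConnector:
    """Hypothetical event handlers: every content event is pushed to the index."""

    def __init__(self, index):
        self.index = index

    def on_item_added(self, item_id, text):
        self.index.upsert(item_id, text)

    def on_item_updated(self, item_id, text):
        self.index.upsert(item_id, text)

    def on_item_deleted(self, item_id):
        self.index.delete(item_id)


index = InMemoryIndex()
connector = PushConnector(index)

# Adding an item makes it searchable immediately, with no crawl in between.
connector.on_item_added("doc-1", "Quarterly budget report")
found_after_add = index.search("budget")

# Deleting the item removes it from the search results just as quickly.
connector.on_item_deleted("doc-1")
found_after_delete = index.search("budget")
```

In a real deployment the handlers would be called by the content platform's event mechanism and the index would be a separate search engine; the sketch only shows the direction of the data flow.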
The three types of crawl available in SharePoint 2013 (full, incremental and continuous) all use the opposite method: pulling content. A crawl is either started manually by a user or scheduled to start at a specified time or at regular intervals.
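For contrast, the pull model can be sketched as follows. This is a simplified illustration and not how SharePoint implements its crawls: the `Crawler` class, the timestamp comparison and the plain dictionaries standing in for the content source and the index are all assumptions made for the example.

```python
class Crawler:
    """Hypothetical pull-based crawler working on plain dictionaries."""

    def __init__(self, source, index):
        self.source = source  # item id -> (modified_at, text)
        self.index = index    # item id -> indexed text
        self.last_crawl = 0

    def full_crawl(self, now):
        # Rebuild the whole index from the source.
        self.index.clear()
        for item_id, (_, text) in self.source.items():
            self.index[item_id] = text
        self.last_crawl = now

    def incremental_crawl(self, now):
        # Pick up only the items modified since the previous crawl.
        for item_id, (modified_at, text) in self.source.items():
            if modified_at > self.last_crawl:
                self.index[item_id] = text
        # Drop indexed items that no longer exist in the source.
        for item_id in list(self.index):
            if item_id not in self.source:
                del self.index[item_id]
        self.last_crawl = now


source = {"doc-1": (1, "Quarterly budget report")}
index = {}
crawler = Crawler(source, index)
crawler.full_crawl(now=2)

# A document added at time 5 stays invisible until the next crawl runs.
source["doc-2"] = (5, "New project plan")
visible_before_crawl = "doc-2" in index

crawler.incremental_crawl(now=10)
visible_after_crawl = "doc-2" in index
```

The gap between `visible_before_crawl` and `visible_after_crawl` is the delay that event-driven indexing aims to remove.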
The following image outlines the two scenarios: the first one illustrates crawling content on demand (as it is done for the full, incremental and continuous crawls) and the second one illustrates event-driven indexing (immediately pushing content to the index on an update).
[Figure: Pulling vs. pushing content]
Example use cases
The following are only some of the use cases where an event-driven push connector can make a big difference in how quickly users can access new content or the newest versions of existing content:
- Being alerted instantly when another user adds an item of interest to SharePoint.
- Having deleted content removed from search immediately.
- Avoiding the annoying situation of adding or updating a document in SharePoint and then not being able to find it in search.
- Viewing real-time calculations and dashboards based on your content.
Findwise SharePoint Push connector
Findwise has developed a connector for its SharePoint customers that performs event-driven indexing of SharePoint content. After the connector is installed, a full crawl of the content is required, after which all updates become available in search almost instantly. The delay between the time a document is updated and the time it becomes available in search is reduced to the time it takes to process the document (that is, to convert it from what you see into a corresponding representation in the search index).
Both FAST ESP and FAST Search Server 2010 for SharePoint (FS4SP) allow content to be pushed to the index; however, this capability was removed in SharePoint 2013. This means that even though we can capture changes to content in real time, we are missing the interface for sending the update to the search index. This might be a game changer if you want to use SharePoint 2013 and take advantage of event-driven indexing, since it means you would have to use another search engine, one that has an interface for pushing content to the index. We have ourselves used a free open source search engine for this purpose. With the search index moved outside the SharePoint environment, search can be integrated with other enterprise platforms, opening up possibilities for connecting different systems through search. Findwise can assist you in choosing the right tools to get the desired search solution.
Another aspect of event-driven indexing is that it limits the resources required to traverse a SharePoint instance. Instead of a process that runs continuously and looks for changes, changes are delivered automatically as they occur, which limits the work required to pick each one up. This is important, since the resource demand of keeping an index up to date can at times be very high in SharePoint installations.
There is also a downside to consider when working with push-driven indexing: it is more difficult to keep track of the state of the index when problems occur. For example, if one of the components of the connector goes down and no pushed data is received during a time interval, it becomes more difficult to follow up on what went missing. To catch the data that was added or updated during the downtime, a full crawl needs to be run. Catching deletes is solved either by keeping a record of the currently indexed data or by comparing the source content with the actual search engine index during the full crawl. Findwise has worked extensively on choosing reliable components with a strong focus on robustness and stability.
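The recovery step described above can be illustrated with a small reconciliation sketch. It assumes, purely for the example, that both the source content and the index can be read as dictionaries mapping item IDs to text; the `reconcile` function is hypothetical and is not part of the connector.

```python
def reconcile(source, index):
    """Bring the index in line with the source after missed push events.

    Returns the IDs that were added or updated and the IDs that were deleted.
    """
    # Items that are new, or whose content changed while events were lost.
    upserted = sorted(i for i, text in source.items() if index.get(i) != text)
    # Items still in the index that no longer exist in the source.
    deleted = sorted(i for i in index if i not in source)

    for item_id in upserted:
        index[item_id] = source[item_id]
    for item_id in deleted:
        del index[item_id]
    return upserted, deleted


# State after an outage: doc-1 was edited, doc-2 was deleted, doc-3 was added.
source = {"doc-1": "Budget report v2", "doc-3": "New project plan"}
index = {"doc-1": "Budget report v1", "doc-2": "Old meeting notes"}

upserted, deleted = reconcile(source, index)
```

Comparing against the index itself, as done here, avoids maintaining a separate record of indexed items, at the cost of reading the whole index during the full crawl.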
The push connector has been used in projects with both SharePoint 2010 and 2013, and has been tested internally with SharePoint 2007. Unfortunately, SharePoint 2007 has a limited set of event receivers, which limits the possibility of pure event-driven indexing. Also, at the moment the connector cannot be used with SharePoint Online.
You will probably be able to add a few more examples to the use cases for event-driven indexing listed in this post. Let us know what you think! And get in touch with us if you are interested in finding out more about the benefits and implications of event-driven indexing and in learning how to reach the next level of findability.