Event driven indexing for SharePoint 2013

In a previous post, we explained the continuous crawl, a new feature in SharePoint 2013 that overcomes previous limitations of the incremental crawl by closing the gap between the time when a document is updated and when the change is visible in search. A different concept in this area is event driven indexing.

Content pull vs. content push

In the case of event driven indexing, the index is updated in real time as an item is added or changed. The event of updating the item triggers the actual indexing of that item, i.e. pushes the content to the index. Similarly, deleting an item results in the item being deleted from the index immediately, making it unavailable in the search results.

The three types of crawl available in SharePoint 2013 (full, incremental and continuous) all use the opposite method: pulling content. A crawl is either started manually by a user or scheduled to start at a specified time or at regular intervals.

The following image outlines the two scenarios: the first one illustrates crawling content on demand (as it is done for the full, incremental and continuous crawls) and the second one illustrates event-driven indexing (immediately pushing content to the index on an update).

Pulling vs pushing content
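To make the difference concrete, here is a minimal Python sketch of the two models. It is not SharePoint code: `SearchIndex`, `ContentStore`, `pull_crawl` and `make_push_handler` are hypothetical names invented for this illustration, and both the content store and the index are plain in-memory dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class SearchIndex:
    """In-memory stand-in for a search index (hypothetical, for illustration)."""
    docs: dict = field(default_factory=dict)

    def upsert(self, doc_id, text):
        self.docs[doc_id] = text

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)


class ContentStore:
    """Stand-in for a content repository that notifies listeners of changes."""

    def __init__(self):
        self.items = {}
        self.listeners = []

    def save(self, doc_id, text):
        self.items[doc_id] = text
        for listener in self.listeners:
            listener("saved", doc_id, text)

    def remove(self, doc_id):
        self.items.pop(doc_id, None)
        for listener in self.listeners:
            listener("removed", doc_id, None)


def pull_crawl(store, index):
    """Pull model: walk the whole store and copy its current state to the index."""
    for doc_id in list(index.docs):
        if doc_id not in store.items:
            index.delete(doc_id)
    for doc_id, text in store.items.items():
        index.upsert(doc_id, text)


def make_push_handler(index):
    """Push model: build an event handler that updates the index per change."""
    def handle(event, doc_id, text):
        if event == "saved":
            index.upsert(doc_id, text)
        elif event == "removed":
            index.delete(doc_id)
    return handle


store = ContentStore()
pulled, pushed = SearchIndex(), SearchIndex()
store.listeners.append(make_push_handler(pushed))

store.save("doc-1", "quarterly report")  # the pushed index sees this at once
pull_crawl(store, pulled)                # the pulled index sees it only now
store.remove("doc-1")                    # gone from pushed; still in pulled
```

In this sketch the pushed index follows every change as it happens, while the pulled index keeps serving the deleted document until the next crawl runs.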

Example use cases

The following examples are only some of the use cases where an event-driven push connector can make a big difference in terms of the time until the users can access new content or newest versions of existing content:

  • Be alerted instantly when an item of interest is added in SharePoint by another user.
  • Have deleted content removed from search immediately.
  • Avoid annoying situations when adding or updating a document to SharePoint and not being able to find it in search.
  • View real-time calculations and dashboards based on your content.

Findwise SharePoint Push connector

Findwise has developed a connector for its SharePoint customers that performs event driven indexing of SharePoint content. After installing the connector, a full crawl of the content is required, after which all updates will be instantly available in search. The delay between the time a document is updated and when it becomes available in search is reduced to the time it takes for the document to be processed (that is, to be converted from what you see to a corresponding representation in the search index).

Both FAST ESP and FAST Search for SharePoint 2010 (FS4SP) allow for pushing content to the index; however, this capability was removed from SharePoint 2013. This means that even though we can capture changes to content in real time, we are missing the interface for sending the update to the search index. This might be a game changer for you if you want to use SharePoint 2013 and take advantage of event driven indexing, since it means you would have to use another search engine that has an interface for pushing content to the index. We have ourselves used a free open source search engine for this purpose. By moving the search index outside the SharePoint environment, the search can be integrated with other enterprise platforms, opening up possibilities for connecting different systems together by search. Findwise can assist you with choosing the right tools to get the desired search solution.

Another aspect of event driven indexing is that it limits the resources required to traverse a SharePoint instance. Instead of continuously having an ongoing process that looks for changes, those changes come automatically when they occur, limiting the work required to get that change. This is an important aspect, since the resource demand for keeping an index updated can at times be very high in SharePoint installations.

There is also a downside to consider when working with push driven indexing. It is more difficult to keep a state of the index in case problems occur. For example, if one of the components of the connector goes down and no pushed data is received during a time interval, it becomes more difficult to follow up on what went missing. To catch the data that was added or updated during the down period, a full crawl needs to be run. Catching deletes is solved by either keeping a state of the current indexed data, or comparing it with the actual search engine index during the full crawl. Findwise has worked extensively on choosing reliable components with a high focus on robustness and stability.
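The catch-up step after an outage can be sketched as a comparison between what the source contains and what the index contains. The snippet below is a simplified illustration and not the connector's actual implementation: the function name `reconcile` and the `{doc_id: version}` maps are assumptions made for this example.

```python
def reconcile(source, indexed):
    """Compare {doc_id: version} maps taken from the source and from the index.

    Returns (to_index, to_delete): ids that are new or changed in the source,
    and ids that exist in the index but no longer exist in the source.
    """
    to_index = sorted(d for d, v in source.items() if indexed.get(d) != v)
    to_delete = sorted(d for d in indexed if d not in source)
    return to_index, to_delete


# State after a hypothetical outage: "a" was updated, "d" was added and
# "c" was deleted while no events were being received.
source = {"a": 3, "b": 1, "d": 1}
indexed = {"a": 2, "b": 1, "c": 5}
to_index, to_delete = reconcile(source, indexed)
# to_index is ["a", "d"] and to_delete is ["c"]
```

Whether the indexed state is kept by the connector or read back from the search engine during the full crawl, the comparison itself stays the same.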

The push connector was used in projects with both SharePoint 2010 and 2013 and tested with SharePoint 2007 internally. Unfortunately, SharePoint 2007 has a limited set of event receivers which limits the possibility of pure event driven indexing. Also, at the moment the connector cannot be used with SharePoint Online.

You will probably be able to add a few more examples to the use cases for event driven indexing listed in this post. Let us know what you think! And get in touch with us if you are interested in finding out more about the benefits and implications of event driven indexing and in learning how to reach the next level of findability.

Continuous crawl in SharePoint 2013

Continuous crawl is one of the new features that comes with SharePoint 2013. As an alternative to incremental crawl, it promises to improve the freshness of the search results, that is, to shorten the time between when an item is updated in SharePoint by a user and when it becomes available in search.

Understanding how this new functionality works is especially important for SharePoint implementations where content changes often and/or where it’s a requirement that the content should instantly be searchable. Moreover, since many of the new SharePoint 2013 functionalities depend on search (see the social features, the popular items, or the content by search web parts), understanding continuous crawl and planning accordingly can help align user expectations with the technical capabilities of the search engine.

Both the incremental crawl and the continuous crawl look for items that were added, changed or deleted since the last successful crawl, and update the index accordingly. However, the continuous crawl overcomes the limitation of the incremental crawl, since multiple continuous crawls can run at the same time. Previously, an incremental crawl would start only after the previous incremental crawl had finished.

Limitation to content sources

Content not stored in SharePoint will not benefit from this new feature. Continuous crawls apply only to SharePoint sites, which means that if you are planning to index other content sources (such as File Shares or Exchange folders) your options are restricted to incremental and full crawl only.

Example scenario

The image below shows two situations. The image on the left (Scenario 1) shows incremental crawls scheduled to start every 15 minutes. The image on the right (Scenario 2) shows a similar scenario where continuous crawls are scheduled every 15 minutes. Around 7 minutes after the first crawl starts, a user updates a document. Let’s also assume that in this case passing through all the items to check for updates takes 44 minutes.

Incremental vs continuous crawl in SharePoint 2013

In Scenario 1, although incremental crawls are scheduled every 15 minutes, a new incremental crawl cannot be started while another incremental crawl is running. The next incremental crawl will only start after the current one has finished. In this scenario the first incremental crawl takes 44 minutes to finish, after which the next incremental crawl kicks in, finds the updated document and sends it to the search index. This scenario shows that it could take around 45 minutes from the time the document was updated until it is available in search.

In Scenario 2, a new continuous crawl will start every 15 minutes, as multiple continuous crawls can run in parallel. The second continuous crawl will see the updated document and send it to the search index. By using the continuous crawl in this case, we have reduced the time it takes for a document to be available in search from around 45 minutes to 15 minutes.
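The arithmetic behind the two scenarios can be reproduced with a few lines of Python. This is a rough model under the same simplifications as the example: a crawl only picks up changes made before it started, crawls begin on schedule ticks counted from minute 0, and document processing time is ignored. The function `next_crawl_start` is a hypothetical helper written for this post, not part of SharePoint.

```python
import math


def next_crawl_start(update_minute, interval, crawl_duration, parallel):
    """Minute at which the first crawl starting at or after the update begins."""
    if parallel:
        # Continuous crawl: a new crawl starts at every schedule tick.
        return math.ceil(update_minute / interval) * interval
    # Incremental crawl: ticks are skipped while a crawl is still running,
    # so the next crawl starts on the first tick after the current one ends.
    start = 0
    while start < update_minute:
        finished = start + crawl_duration
        start = math.ceil(finished / interval) * interval
    return start


incremental = next_crawl_start(7, interval=15, crawl_duration=44, parallel=False)
continuous = next_crawl_start(7, interval=15, crawl_duration=44, parallel=True)
# incremental is 45 and continuous is 15 (minutes after the first crawl started)
```

The minutes are counted from the start of the first crawl, which is where the rounded figures of 45 and 15 minutes in the scenarios come from. Measured from the update itself at minute 7, the waits in this model are 38 and 8 minutes.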

Not enabled by default

Continuous crawls are not enabled by default. They are enabled in the same place as the incremental crawl: in Central Administration, under the Search Service Application, per content source. The interval at which a continuous crawl starts defaults to 15 minutes, but it can be changed through PowerShell to a minimum of 1 minute if required. Lowering the interval will, however, increase the load on the server. Another number to take into consideration is the maximum number of simultaneous requests, which is also configured from Central Administration.

Continuous crawl in Office 365

Unlike in SharePoint 2013 Server, continuous crawls are enabled in SharePoint Online by default but are managed by Microsoft. For those used to the Central Administration from the on-premise SharePoint server, it might sound surprising that this is not available in SharePoint Online. Instead, there is a limited set of administrative features. Most of the search features can be managed from this administrative interface, though the ability to manage the crawling on content sources is missing.

The continuous crawl in Office 365 offers little control or configuration. The crawl frequency cannot be modified; Microsoft targets between 15 minutes and one hour between a change and its availability in the search results, though in some cases it can take hours.

Closer to real-time indexing

The continuous crawl in SharePoint 2013 overcomes previous limitations of the incremental crawl by closing the gap between the time when a document is updated and when this is visible in the search index.

A different concept in this area is the event driven indexing, which we will explain in our next blog post. Stay tuned!

Enterprise Graph Search

Facebook will soon launch their new Graph Search to the general public, and it has received a lot of interest lately.

With graph search, users will be able to query the social graph that millions of people have constructed over the years when friending each other and putting in more and more personal information about themselves and their friends in the vast Facebook database. It will be possible to query for friends of friends who have similar interests as you, and invite them to a party, or to query for companies where people with similar beliefs as you work, and so on and so forth. The information that is already available will all of a sudden become much more accessible through the power of graph search.

How can we bring this to an enterprise search environment? Well, there are lots of graphs to query in the enterprise as well, both social and other types. For example, how about being able to query for people who have been members of a project in the last three years that involved putting a new product successfully on the market? This would be an interesting list of people to know about if you’re a marketing director who wants to assemble a team in the company to create a new product and make sure it succeeds in the market.

If we dissect graph search, we will find three important concepts:

  1. The information we want to query against doesn’t only need to be indexed into one central search engine; the relations and attributes of all information objects also need to be normalized to create the relational graph and to have standard attributes to query against. We could use the Open Graph Protocol as the foundation.
  2. We need a parser that takes human language and converts it to a formal query language that a search engine understands. We might want to query in different human languages as well.
  3. The presentation of results should be adapted to the kind of information sought for. In Facebook’s example, if you query for people you will get a list of people with their pictures and some relevant personal information in the result list, and if you query for pictures you will get a collage of pictures (similar to the Google image search).

So the recipe for success is to give the information management part of the project a big focus, making sure to create a unified information model of the content to be indexed. Then create a query parser for natural language based on actual user behavior; the same user studies would also give us information on how to visualize the different result set types.

I believe we will see more of these kinds of solutions in the coming years in the enterprise search market, and I look forward to exploring the possibilities together with our clients.
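As a toy illustration of the first two concepts, the Python sketch below stores a normalized information model as (subject, relation, object) triples and translates one fixed English phrasing into a structured query over them. Everything in it is an assumption made for the example: the data, the relation names `member_of` and `tagged`, and the single supported sentence pattern. A real natural language parser would be considerably more involved.

```python
import re

# Hypothetical unified information model: (subject, relation, object) triples.
TRIPLES = [
    ("anna", "member_of", "project-apollo"),
    ("bert", "member_of", "project-apollo"),
    ("carl", "member_of", "project-zeus"),
    ("project-apollo", "tagged", "product launch"),
    ("project-zeus", "tagged", "migration"),
]

# The only phrasing this sketch understands.
PATTERN = re.compile(r"people who worked on projects about (?P<topic>.+)", re.I)


def parse(question):
    """Turn a human-language question into a small structured query."""
    match = PATTERN.fullmatch(question.strip())
    if match is None:
        raise ValueError("question not understood: " + question)
    return {"via": "member_of", "where": ("tagged", match.group("topic").lower())}


def run(query, triples):
    """Find the projects matching the condition, then the people linked to them."""
    relation, value = query["where"]
    projects = {s for s, r, o in triples if r == relation and o == value}
    return sorted(s for s, r, o in triples if r == query["via"] and o in projects)


people = run(parse("People who worked on projects about product launch"), TRIPLES)
# people is ["anna", "bert"]
```

The point of the sketch is the separation of concerns: the normalized triples can be queried no matter how the question was phrased, and the parser can grow with observed user behavior without touching the data model.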

Microsoft is betting on cloud, mobile and social for SharePoint 2013 – Impressions from the SharePoint Conference 2012

Over 10,000 attendees from 85 countries, more than 200 sponsors and exhibitors, and over 250 sessions. Besides these impressive numbers, the 2012 SharePoint conference in Las Vegas has also marked the launch of the new version of SharePoint. Findwise was there to learn and is now sharing with you the news about enterprise search in SharePoint 2013.

In the keynote presentation on the first day of the conference, Jared Spataro (Senior Director, SharePoint Product Management at Microsoft) mentioned the three big bets made for the SharePoint 2013 product: CLOUD, MOBILE, and SOCIAL. This post tries to provide a brief overview of what these three buzzwords mean for the enterprise search solution in SharePoint 2013. Before reading this, also check out our previous post about search in SharePoint 2013 to get a taste of what’s new in search.

Search in the cloud

While you have probably heard the saying that “the cloud has altered the economics of computing” (Jared Spataro), you might be wondering how to get there: how to go from where you are now to the so-called cloud. The answer for search is that SharePoint 2013 provides a hybrid approach that helps out in this transition. Hybrid search promises to be the bridge between on-premises and the cloud.

The search results from the cloud and those from on-premises can be shown on the same page with the use of “result blocks”. The result block, new to SharePoint 2013, is a block of results that are individually ranked and grouped according to a “query rule”. In short, a query rule defines a condition and an action to be fired when the condition is met. With the use of result blocks, you can display the search results for content coming from the cloud when searching from an on-premises site and the other way around (depending on whether you want the search to be one-way or bidirectional), and you can also conditionally enable these result blocks depending on the query (for example, queries matching specific words or regular expressions).

Screenshot from the post Hybrid search of the Microsoft SharePoint Team Blog showing how results from the cloud are integrated in the search results page when the user searches from an on-premises SharePoint 2013 site.
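The idea of a query rule as a condition plus an action can be sketched in a few lines of Python. This models the concept only and is not the SharePoint API: the `QueryRule` class, the `result_blocks` function and the example rules are all made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class QueryRule:
    """Hypothetical model of a query rule: a condition and an action."""
    name: str
    pattern: str       # condition: regular expression matched against the query
    block_source: str  # action: show a result block fed from this source

    def matches(self, query):
        return re.search(self.pattern, query, re.IGNORECASE) is not None


def result_blocks(query, rules):
    """Return the sources whose result blocks should be shown for this query."""
    return [rule.block_source for rule in rules if rule.matches(query)]


RULES = [
    QueryRule("hr-in-cloud", r"\b(vacation|payroll)\b", "SharePoint Online"),
    QueryRule("specs-on-prem", r"^spec\s", "on-premises"),
]

blocks = result_blocks("Vacation policy", RULES)
# blocks is ["SharePoint Online"]; a query such as "budget 2013" yields []
```

In this model a query that matches no rule simply gets the ordinary result list, which mirrors how result blocks are enabled conditionally depending on the query.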

Before making the decision to move to the cloud, it is wise to check the current features availability for both online and on-premise solutions on TechNet.

Mobile devices

With SharePoint 2013, Microsoft has added native mobile apps for Windows, Windows Phone, iPhone, and iPad, and support across different mobile devices (TechNet), which provides access to information and people wherever the users are searching from.

Also important to mention when talking about mobile is that the improved REST API widens the extensibility options and allows easy development of custom user experiences across different platforms and devices. The search REST API provides access to the keyword query language parameters, and combining this with a bit of JavaScript and HTML allows developers to quickly start building apps with custom search experiences and making all information available across devices.
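As a small illustration, the Python snippet below builds a request URL for the search REST endpoint without sending it. The site URL is hypothetical, `search_url` is a helper written for this post, and the `querytext` and `rowlimit` parameters are used as we understand them from the SharePoint 2013 documentation, so check the official reference before relying on the details.

```python
from urllib.parse import quote


def search_url(site_url, keywords, row_limit=10):
    """Build a search REST query URL for a keyword query (no request is sent)."""
    base = site_url.rstrip("/") + "/_api/search/query"
    # The keyword query is passed as a quoted string value and URL-encoded.
    return f"{base}?querytext='{quote(keywords)}'&rowlimit={row_limit}"


url = search_url("https://intranet.example.com/", "sharepoint 2013")
# url is "https://intranet.example.com/_api/search/query"
#        "?querytext='sharepoint%202013'&rowlimit=10"
```

A client on any platform can then issue a GET request to such a URL with the user's credentials and render the returned results in its own interface.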

Social search

In the same keynote, Jared Spataro said that Microsoft has “integrated social very deeply into the product, creating new experiences that are really designed to help people collaborate more easily and help companies become more agile.” This was also conveyed by the presence of the two founders of the enterprise social network Yammer in the keynote presentation. The new social features integration means that the information about people following content, people following other people, tags, mentions, posts, discussions, are not only searchable but can be used in improving the relevance of the search results and improving the user experience overall. Also, many of the social features are driven by search, such as the recommendations for people or documents to follow.

Whether you are trying to find an answer to a problem to which the solution has already been posted by somebody else, or whether you are trying to find a person with the right expertise through the people search, SharePoint 2013 provides a more robust and richer social search experience than its previous versions. And the possibilities to extend the out-of-the-box capabilities must be very attractive to businesses that are for example looking to combine the social interactivity inside SharePoint with people data stored in other sources (CRM solutions, file shares, time tracking applications, etc).

Stay tuned!

It was indeed an awesome conference, well organized, but most of the time it was hard to decide which presentation to choose from the many good sessions running at the same time. Luckily (or wisely), we had more than one Findwizard on location!

This post is part of our series of reports from the SharePoint 2012 Conference. Keep an eye on the Findability blog for part two of our report from the biggest SharePoint conference of 2012!

Search in SharePoint 2013

There has been a lot of buzz about the upcoming release of Microsoft’s SharePoint 2013, but how about the search in SharePoint 2013? The SharePoint Server 2013 Preview has been available for download since July this year, and a few days ago the new SharePoint reached Release to Manufacturing (RTM), with general availability expected for the first quarter of 2013.

If you currently have an implementation of SharePoint in your company, you are probably wondering what the new SharePoint can add to your business. Microsoft’s catchphrase for the new SharePoint is that “SharePoint 2013 is the new way to work together”. If you look at it from a tech perspective, amongst other features, SharePoint 2013 introduces a cloud app model and marketplace, a redesign of the user experience, an expansion of collaboration tools with social features (such as microblogging and activity feeds), and enhanced search functionality. There are also some features that have been deprecated or removed in the new product, and you can check these on TechNet.

Let’s skip now to the new search experience provided out-of-the-box by SharePoint 2013. The new product revolves around the user more than ever, and that can be seen in search as well. Here are just a few of the new or improved functionalities. A hover panel to the right of a search result allows users to quickly inspect content. For example, it allows users to preview a document and take actions based on document type. Users can find and navigate to past search results from the query suggestions box, and previously clicked results are promoted in the results ranking. The refiners panel now reflects more accurately the entities in your content (deep refiners) and visual refiners are available out-of-the-box. Social recommendations are powered by users’ search patterns, and video and audio have been introduced as new content types. Some of the developers reading this post will also be happy to hear that SharePoint 2013 natively supports PDF files, meaning that you are not required anymore to install a third-party iFilter to be able to index PDF files!

Search results page in SharePoint 2013 – from the Microsoft Office blog

While the out-of-the-box SharePoint 2013 search experience sounds exciting, you may also be wondering how many customization and extensibility opportunities you have. You can of course search content outside SharePoint, and several connectors are included that allow you to get content from repositories such as file shares, the web, Documentum, Lotus Notes and public Exchange folders. Without any code, you can use the query rules to combine user searches with business rules. Also, you can associate result types with custom templates to enrich the user experience. Developers can now extend content processing and enrichment, which previously could only be achieved using FAST Search for SharePoint. More than that, organizations have the ability to extend the search experience through a RESTful API.

This post does not cover all the functionalities and if you would like to read more about what changes the new SharePoint release brings, you can start by checking the TechNet material and following the SharePoint Team Blog and the Findwise Findability Blog, and then get in touch with us if you are considering implementing SharePoint 2013 in your organization or company.

Findwise will attend the SharePoint Conference 2012 in Las Vegas USA between 12-15 November and this will be a great opportunity to learn more about the upcoming SharePoint. We will report from the conference from a findability and enterprise search perspective. Findwise has years of experience in working with FAST ESP and SharePoint, and is looking forward to discussing how SharePoint 2013 can help you in your future enterprise search implementation.

The Enterprise Search and Findability Report 2012 is ready

No strategy, no budget, no resources. This is the common scenario for enterprise search and findability in many organisations today. Still, enterprise search is considered a critical success factor in 75% of the organisations that responded to the global survey that ran from March to May this year.

The Enterprise Search and Findability Report 2012 is now ready for download.

The Enterprise Search and Findability Report 2012 shows that 60% of the respondents expressed that it is very/moderately hard to find the right information. Only 11% stated that it is fairly easy to search for information and as few as 3% consider it very easy to find the desired information. This shows that there still is a large untapped potential for any organisation to get great value from investing in enterprise search. For a relatively small investment, preferably in personnel, it is possible to make search a lot better. The survey also reveals that organisations who are very satisfied with their search have a (larger) budget, more resources and systematically work with analysing search.

Figure. What is your primary goal for utilising search technology in your organisation?

The primary goal for using search is to accelerate retrieval of known information sources, 91%, and to improve the re-use of content (information/knowledge), 72%. This indicates that search within organisations is often used as a discovery tool for what is already known. Looking over the next three years, as many as 77% think that the amount of information in the organisation will increase. This means that every year it will be even more important to be able to find the right information, and that means enterprise search is still very much needed, as stated in the following great presentations (on video): Why Business Success Depends on Enterprise Search (by Martin White of Intranet Focus) and The Enterprise Search Market – What should be on your radar? (by Alan Pelz-Sharpe of 451 Research).

Download the full report.

Findability day in Stockholm – search trends and customer insights

Last Thursday about 50 of Findwise’s customers, friends and people from the industry gathered in Stockholm for a Findability day (#findday12). The purpose was simply to share experiences from choosing, implementing and developing search and findability solutions for all types of business and use cases.

Martin White, who has been in the intranet business since 1996, held the keynote speech about “Why business success depends on search”.
Among other things he spoke about why the work starts once search is implemented, how a search team should be staffed and what the top priority areas are for larger companies.
Martin has also published an article about Enterprise Search Team Management that gives valuable insight into how to staff a search initiative, as well as a recent research note on enterprise search trends and developments.

Henrik Sunnefeldt, SKF, and Joakim Hallin, SEB, were next on stage and shared their experiences from working with larger search implementations.
Henrik, who is program manager for search at SKF, showed several examples of how search can be applied within an enterprise (intranet, internet, apps, Search-as-a-Service etc) to deliver value to both employees and customers.
As for SEB, Joakim described how SEB has worked actively with search for the past two years. The most popular and successful implementation is a Global People Search. The presentation showed how SEB have changed their way of working; from country specific phone books to a single interface that also contains skills, biographies, tags and more.

During the day we also had the opportunity to listen to three expert presentations about Big data (by Daniel Ling and Magnus Ebbeson), Hydra – a content processing framework – video and presentation (by Joel Westberg) and Better Business, Protection & Revenue (by David Kemp from Autonomy).
As for Big data, there is also a good introduction here on the Findability blog.

Niklas Olsson and Patric Jansson from KTH came on stage at 15:30 and described how they have been running their swift-footed search project during the last year. There are some great learnings from working early with requirements and putting effort into the data quality.

Last, but not least, the day ended with Kristian Norling from Findwise, who gave a presentation on the results from the Enterprise Search and Findability Survey. During spring 2012, 170 respondents from all over the world filled out the survey, which showed some quite interesting patterns.
Did you for example know that in many organisations search is owned either by IT (58%) or Communication (29%), that 45% have no specified budget for search and 48% of the participants have less than 1 dedicated person working with search? Furthermore, 44.4% have a search strategy in place or are planning to have one in 2012/13.
The survey results are also discussed in one of the latest UX podcasts from James Royal-Lawson and Per Axbom.

Thank you to all presenters and participants who contributed to making Findability day 2012 inspiring!

We are currently looking into arranging Findability days in Copenhagen in September, Oslo in October and Stockholm early next spring. If you have ideas (speakers you would like to hear, case studies that you would like insight in etc), please let us know.

Reflections on Search at Intranets 2012 conference

Despite large corporations spending hundreds of millions of euros creating information they spend almost nothing on search, Martin White said at the recent Intranets 2012 conference. But before dealing with this depressing fact, I would like to start on a more positive note.

Being a search professional, it was an absolute joy to jump over to the other side of the fence and join the well over a hundred intranet professionals at Intranets 2012 in gorgeous Sydney. I wholeheartedly recommend searching for #intranets2012 on Twitter to get a feel for the fun, inspiration and knowledge sharing that went on.

With sessions on collaboration, from recognized experts such as Michael Sampson, or by seasoned practitioners such as William Amurgis from American Electric Power, it was clear that social intranets are not only a buzz word but are already providing businesses with great value. Meanwhile James Robertson demanded that we raise the bar for design and usability from providing function to delivering pretty and simple intranets that surprise and delight. Mandy Geddes from Institute of Executive coaching gave me a brilliant idea of how to use private online communities to engage customers.

But in spite of returning from Sydney with a feeling of new energy, eagerness and almost urgency to get back to helping my customers and colleagues, I also realized that search was obviously not on everyone’s mind. Except for Martin White’s excellent keynote, only one session I attended, Ausgrid Power’s presentation of their intranet “the grid”, had search as a key area. Hopefully these few glimpses of light sparked something, and I honestly think they did, bearing in mind the discussions I had in the breaks and at the fantastic social event on Thursday evening.

After writing this to share my thinking, I have two things to say:

Findability ambassadors: our work has only begun, and I hope to see you all at Intranets 2013, because I’m sure going!

Update on The Enterprise Search and Findability Survey

A quick update on the status of the Enterprise Search survey.

We now have well over a hundred respondents. The more respondents, the better the data will be, so please help spread the word. We’d love to have several hundred more. The survey will now be open until the end of April.

But most important of all, if you haven’t already, have a cup of coffee and fill in the survey.

A Few Results from the Survey about Enterprise Search

More than 60% say that the amount of searchable content in their organisations today is less or far less than needed. And 85% say that in three years’ time the amount of searchable content in the organisation will increase or increase significantly.

75% say that it is critical to find the right information to support their organisation’s business goals and success. But it is interesting to note that over 70% of the respondents say that users don’t know where to find the right information or what to look for – and about 50% of the respondents say that it is not possible to search more than one source of information from a single search query.

In this context it is interesting that the primary goal for using search in organisations (where the answer is imperative or significant) is to:

  • Improve re-use of information and/or knowledge – 59%
  • Accelerate brokering of people and/or expertise – 55%
  • Increase collaboration – 60%
  • Raise awareness of “What We Know” – 57%
  • and finally to eliminate siloed repositories – 59%

In many organisations search is owned either by IT (60%) or Communication (27%), has no specified budget (38%) and has less than 1 dedicated person working with it (48%). More than 50% have a search strategy in place or are planning to have one in 2012/13.

These numbers I think are interesting, but definitely need to be segmented and analyzed further. That will of course be done in the report which is due to be ready in June.